When I need to copy a (WordPress) website, e.g. as a baseline for another project, I often use the UpdraftPlus plugin.
From the source website I download the five generated files: Themes, Plugins, Database, Other, Uploads. On a freshly installed target WordPress website I install UpdraftPlus, upload the five files and hit restore (I ignore the warning* about the different URL / domain name).
* I fix this by changing it within the database itself (usually through phpMyAdmin).
In the wp*_options table I change the siteurl and the home values.
Next, I install Better Search Replace and run a search and replace to change any leftover ‘old’ / source URLs.
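
For those who prefer the command line over phpMyAdmin, the siteurl / home change can be sketched roughly like this. The function name, the target URL and the database name are made up for illustration, and the table prefix is assumed to be the default wp_ unless you pass another one. The URL is pasted into the SQL as-is, so it must not contain a single quote.

```shell
# Hypothetical helper: print the SQL that points siteurl and home at a new URL.
# $1 = new URL, $2 = table prefix (assumed to be wp_ when omitted)
siteurl_sql() {
    printf "UPDATE %soptions SET option_value = '%s' WHERE option_name IN ('siteurl', 'home');\n" "${2:-wp_}" "$1"
}

# Example with made-up names: review the statement first, then pipe it to mysql.
# siteurl_sql 'https://target.example' | mysql target_db
```

Printing the statement instead of running it straight away leaves room to check it before it touches the database.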

Most of the time this approach works perfectly. In some cases, however, I noticed that some characters came out weird. Most of the time those were emojis.

Sounds like an encoding problem, right?

Bypassing UpdraftPlus, using the Unix mysqldump and importing that dump into the target website did not resolve my issue.

However, adding an extra parameter to mysqldump seemed to do the trick.

mysqldump --default-character-set=utf8mb4
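
A fuller version of the dump and import might look like the sketch below. The function names, database names and file name are placeholders, and it assumes the MySQL credentials come from an option file such as ~/.my.cnf (otherwise add -u / -p yourself). Passing the same character set to the mysql client on import is my own assumption of what keeps both ends consistent.

```shell
# Hypothetical helpers; database and file names are placeholders.

# Dump a database while forcing the connection charset to utf8mb4.
# $1 = database name, $2 = output file
dump_site_db() {
    mysqldump --default-character-set=utf8mb4 "$1" > "$2"
}

# Import a dump using the same connection charset.
# $1 = database name, $2 = dump file
import_site_db() {
    mysql --default-character-set=utf8mb4 "$1" < "$2"
}

# Example with made-up names:
# dump_site_db source_db site.sql
# import_site_db target_db site.sql
```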

Importing this file shows my emojis … yeah.
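
A likely reason the flag matters (my reading, not something verified on the affected site): MySQL's utf8 character set stores at most 3 bytes per character, while emojis need 4 bytes in UTF-8, which is what utf8mb4 allows. The small sketch below only counts bytes to illustrate that; byte_len is a made-up helper, and the octal escapes spell out the raw UTF-8 bytes so the script does not depend on the file's own encoding.

```shell
# Hypothetical helper: print how many bytes a string occupies.
# The argument is used as a printf format, so octal escapes become raw bytes.
byte_len() {
    echo "$(printf "$1" | wc -c | tr -d '[:space:]')"
}

byte_len 'e'                 # plain ASCII letter: prints 1
byte_len '\303\251'          # é (U+00E9): prints 2
byte_len '\360\237\230\200'  # grinning face emoji (U+1F600): prints 4
```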

So why does UpdraftPlus (and probably other backup plugins as well) seem to mess this up?

It seems those backup plugins typically check wp-config.php for the encoding setting (for importing / exporting files).

So it’s probably rather a mismatch in wp-config.php, i.e. wp-config.php stating a ‘wrong’ encoding (utf8 instead of utf8mb4 in my case).

/** Database Charset to use in creating database tables. */
define( 'DB_CHARSET', 'utf8' );
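
To spot such a mismatch before taking a backup, something like the following could be run against the source site's wp-config.php. Both function names are made up, and the sketch assumes the define() sits on a single line in the usual form shown above.

```shell
# Hypothetical helper: print the DB_CHARSET value declared in a wp-config.php.
# Assumes the define() is on one line, as in a stock wp-config.php.
wp_db_charset() {
    sed -n "s/^[[:space:]]*define([[:space:]]*['\"]DB_CHARSET['\"][[:space:]]*,[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$1"
}

# Warn when the declared charset is anything other than utf8mb4.
check_wp_charset() {
    charset=$(wp_db_charset "$1")
    case "$charset" in
        utf8mb4) echo "ok: $charset" ;;
        *)       echo "warning: DB_CHARSET is '$charset', emojis may get mangled" ;;
    esac
}

# Example with a made-up path:
# check_wp_charset /var/www/html/wp-config.php
```

If the tables themselves already use utf8mb4, changing the define to match is probably the fix, but check the actual table charset first.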