Set a target file size in MB or pick a bitrate. Optional: trim the start or end, force mono (smaller file, fine for voice), or normalize loudness across files (EBU R128, what streaming services use).
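For the curious, the link between a target size and a bitrate is simple arithmetic: size in bits divided by duration. The sketch below is only an illustration, not the tool's actual code. The function name is hypothetical, it assumes 1 MB = 1,000,000 bytes, and it ignores container overhead, so real files will land slightly off the target.

```python
def bitrate_for_target_size(target_mb: float, duration_s: float) -> int:
    """Return the average bitrate in kbit/s that roughly fills target_mb.

    Hypothetical helper: assumes 1 MB = 1,000,000 bytes and ignores
    container/metadata overhead.
    """
    if target_mb <= 0 or duration_s <= 0:
        raise ValueError("target size and duration must be positive")
    total_bits = target_mb * 1_000_000 * 8   # megabytes -> bits
    bits_per_second = total_bits / duration_s
    return int(bits_per_second / 1000)       # bits/s -> kbit/s, rounded down


# A 10-minute recording squeezed into 10 MB needs about 133 kbit/s.
print(bitrate_for_target_size(10, 600))
```

Rounding down rather than to the nearest value keeps the estimate on the safe side of the target size.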
Drop your files in the order you want, pick the output format, and you'll get one continuous file.
1.5× sounds noticeably faster, 0.75× noticeably slower. Pitch is preserved: the voice doesn't get higher or lower, just faster or slower.
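To estimate how long the result will be, divide the original length by the speed factor. A minimal sketch, with a hypothetical function name:

```python
def duration_after_speed_change(duration_s: float, speed: float) -> float:
    """Return the new length in seconds after playing at `speed` times normal."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return duration_s / speed


# A 60-second clip at 1.5x and at 0.75x.
print(duration_after_speed_change(60, 1.5))   # 40.0
print(duration_after_speed_change(60, 0.75))  # 80.0
```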
Set the duration in seconds. A 2-second fade is the usual sweet spot for podcasts and music.
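As a rough picture of what a fade does, the sketch below computes the volume multiplier during a fade-in. It assumes a straight-line (linear) ramp, which is an assumption for illustration only; the tool may use a different curve. The function name is hypothetical.

```python
def fade_in_gain(t: float, fade_s: float) -> float:
    """Return the volume multiplier (0.0 to 1.0) at time t of a linear fade-in.

    Assumption: a straight-line ramp from silence to full volume over fade_s.
    """
    if fade_s <= 0:
        return 1.0  # no fade requested: full volume from the start
    # Clamp so times before the start give 0.0 and times after the fade give 1.0.
    return min(max(t / fade_s, 0.0), 1.0)


# Halfway through a 2-second fade the audio plays at half volume.
print(fade_in_gain(1.0, 2.0))  # 0.5
```

A fade-out is the mirror image: the multiplier falls from 1.0 to 0.0 over the last `fade_s` seconds.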