Conversation

@alexpacio

@alexpacio alexpacio commented Oct 30, 2025

This PR introduces a robust benchmarking tool for measuring VM boot times with high precision. The tool is meant to measure the whole elapsed time, from the launch of QEMU to the moment the host receives a message that the VM echoes into the virtio-console port.
The tool relies on efficient ways of detecting when the UNIX socket is ready. Unfortunately, I couldn't use startnb.sh because I need to create the UNIX socket beforehand and attach the QEMU process to it as a client, so that the message arriving on the socket can be read deterministically.
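
The measurement idea described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not the code in this PR: a FIFO stands in for the UNIX socket, an arbitrary command stands in for the QEMU process, the helper names (`now_ns`, `format_ns`, `measure_boot`) are hypothetical, and `date +%s%N` assumes GNU date.

```shell
#!/bin/sh
# Hypothetical sketch: a FIFO stands in for the UNIX socket and a plain
# command stands in for the QEMU process.

# Current time in nanoseconds (assumes GNU date, which supports %N).
now_ns() { date +%s%N; }

# Format a nanosecond count as seconds with nine decimals.
format_ns() {
    printf '%d.%09d' "$(( $1 / 1000000000 ))" "$(( $1 % 1000000000 ))"
}

# measure_boot CMD...: run CMD with its stdout attached to a channel that
# exists before CMD starts, then time how long the first line takes to arrive.
measure_boot() {
    local dir chan start end msg pid
    dir=$(mktemp -d) || return 1
    chan="$dir/ready"
    mkfifo "$chan" || return 1      # the channel is created beforehand

    start=$(now_ns)
    "$@" > "$chan" &                # stand-in for launching QEMU
    pid=$!
    IFS= read -r msg < "$chan"      # blocks until the first message arrives
    end=$(now_ns)

    wait "$pid" 2>/dev/null         # reap the stand-in process
    rm -rf "$dir"
    echo "Boot time: $(format_ns "$(( end - start ))") seconds"
}
```

For instance, `measure_boot sh -c 'sleep 0.2; echo booted'` reports a little over 0.2 seconds. Creating the channel before starting the process is what makes the read deterministic: the reader never has to poll for the endpoint to appear.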

A new service type, called "benchmark", is also introduced to meet this specific need.

It could be interesting to add a GitHub Action that continuously measures the execution times for each build.

The only things still WIP here are:

  • test on macOS
  • sysctl.conf is not being written into the service's image file by the build process, so I'm not sure whether the boot time is currently affected by the missing sysctl directive

@iMilnb please take a look and let me know if you think it should be revised in any way. Thank you

@iMilnb
Contributor

iMilnb commented Oct 31, 2025

Very nice indeed! Don't you think it would be easier to maintain if those changes were merged into startnb.sh, since it reuses 90% of its code? I'm thinking of a "benchmark" flag that, if present, either executes your code or sources it from an include.

@alexpacio
Author

Sure can do this!

@alexpacio
Author

I've just pushed a revised version of the script; it is still an untested draft. It has been moved from a dedicated script to helper functions that startnb.sh uses when the relevant flags are passed.
@iMilnb, please let me know if this looks good to you so that I can test it and complete the implementation.

@iMilnb
Contributor

iMilnb commented Nov 2, 2025

That's almost perfect. Could you revamp the documentation to reflect the changes? Thanks!

@gcavelier
Contributor

Here is the result on macOS (arm64):

❯ ./startnb.sh -i images/rescue-evbarm-aarch64.img -k kernels/netbsd-GENERIC64.img -B
Socket ready: measure_boot.sock
Starting VM at Tue Nov  4 07:59:57 CET 2025
➡️ using console: com
➡️ using QEMU version 10.1.2
host socket 1: sHmXyz61cp1.sock
➡️ booting image images/rescue-evbarm-aarch64.img with kernel kernels/netbsd-GENERIC64.img

=========================================
Boot time: 0.038115000 seconds
=========================================

scripts/benchmark.sh: line 31: 26092 Terminated: 15          eval $cmd > /dev/null 2>&1

Not sure if the last line is normal
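
The last line looks like bash's own report that the background `eval $cmd` job was ended by SIGTERM (signal 15), which is what a script prints after killing a child it started in the background. That reading is an assumption based on the output alone. If it holds, a common way to keep the message out of the output is to reap the job with `wait` while its stderr is redirected. A minimal sketch, where `stop_quietly` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: terminate a background job and reap it, sending the
# shell's "Terminated" notice (if any) to /dev/null.
stop_quietly() {
    kill "$1" 2>/dev/null    # send SIGTERM, ignore "no such process"
    wait "$1" 2>/dev/null    # reap the job; the notice goes to /dev/null
    return 0                 # hide the 128+15 exit status from the caller
}

# Usage (illustrative):
#   eval "$cmd" > /dev/null 2>&1 &
#   stop_quietly "$!"
```

The message itself is harmless; the measured boot time is printed before the job is killed.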

@iMilnb
Contributor

iMilnb commented Nov 5, 2025

Here is the result on macOS (arm64):

❯ ./startnb.sh -i images/rescue-evbarm-aarch64.img -k kernels/netbsd-GENERIC64.img -B
Socket ready: measure_boot.sock
Starting VM at Tue Nov  4 07:59:57 CET 2025
➡️ using console: com
➡️ using QEMU version 10.1.2
host socket 1: sHmXyz61cp1.sock
➡️ booting image images/rescue-evbarm-aarch64.img with kernel kernels/netbsd-GENERIC64.img

=========================================
Boot time: 0.038115000 seconds
=========================================

scripts/benchmark.sh: line 31: 26092 Terminated: 15          eval $cmd > /dev/null 2>&1

Not sure if the last line is normal

@alexpacio can you take a look at this please?

@alexpacio
Author

alexpacio commented Nov 5, 2025

Yes, I'll take a look today. I've just tested it on Linux x86-64 and it works there. However, I've just found two things to improve:

  • the UNIX socket files created for the viocon ports that are not used for benchmarking are not cleaned up when the script ends. Wouldn't it be better if they were removed?
  • when both the quiet and the benchmark flags are passed, the script still prints all the regular logs. I'd prefer it to print only the benchmark result, so that the caller has a predictable way to retrieve the timing (e.g. if I want to evaluate that timing in a CI/CD pipeline, it's better not to have the output polluted with unneeded lines). This might require moving the log() function into startnb.sh and using it in place of echo wherever echo is currently used. The quiet flag would then suppress every log, regardless of the benchmark flag.
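
Both points can be sketched together in shell. This is an illustration only: `QUIET`, `SOCKETS`, `track_socket`, `cleanup_sockets` and `report_boot_time` are hypothetical names, and the real `log()` in the repository may look different.

```shell
#!/bin/sh
# Hypothetical sketch; none of these names are taken from startnb.sh.

QUIET=${QUIET:-0}   # set to 1 by the quiet flag
SOCKETS=""          # space-separated list of socket files to remove

# Print a regular log line unless quiet mode is on.
log() {
    [ "$QUIET" = 1 ] || echo "$@"
}

# Always print the benchmark result so that a caller can parse it.
report_boot_time() {
    echo "Boot time: $1 seconds"
}

# Remember a socket path so that it can be removed when the script ends.
track_socket() {
    SOCKETS="$SOCKETS $1"
}

# Remove every tracked socket file.
cleanup_sockets() {
    # Word splitting is intentional; assumes paths without spaces.
    for s in $SOCKETS; do
        rm -f "$s"
    done
}

# Run the cleanup whenever the script exits.
trap cleanup_sockets EXIT
```

With `QUIET=1`, only the `Boot time:` line reaches stdout, which a CI job could then extract with a single `grep` or `awk` call.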

Do these changes look good to you, @iMilnb?
After that I'm going to test and fix it on macOS arm64.

@alexpacio
Author

alexpacio commented Nov 6, 2025

Hey, are multiple virtio-console ports even supported on macOS? I have the feeling they aren't:

./startnb.sh -k kernels/netbsd-GENERIC64.img -i images/mport-evbarm-aarch64.img -n 2
➡️ using console: com
➡️ using QEMU version 10.1.0
➡️ no service name, using UUID (RgqoNRi7)
host socket 1: s-RgqoNRi7-p1.sock
host socket 2: s-RgqoNRi7-p2.sock
➡️ booting image images/mport-evbarm-aarch64.img with kernel kernels/netbsd-GENERIC64.img
[   1.0000000] NetBSD/evbarm (fdt) booting ...
[   1.0000000] NetBSD 11.0_BETA (GENERIC64)    Notice: this software is protected by copyright
[   1.0000000] Detecting hardware...[   1.0000040] entropy: ready
[   1.0000040]  done.
Created tmpfs /dev (1359872 byte, 2624 inodes)
add net default: gateway 10.0.2.2
/dev/ld4a on / type ffs (noatime, read-only, local)
/bin/sh: Can't open /etc/MAKEDEV

Perhaps the NetBSD kernel image for aarch64 does not include the new kernel module with multiport support?

@alexpacio
Author

Here's the problem for MacOS:

nm kernels/netbsd-GENERIC64.img          
/Library/Developer/CommandLineTools/usr/bin/nm: error: kernels/netbsd-GENERIC64.img: The file was not recognized as a valid object file

@gcavelier
Contributor

Here's the problem for MacOS:

nm kernels/netbsd-GENERIC64.img          
/Library/Developer/CommandLineTools/usr/bin/nm: error: kernels/netbsd-GENERIC64.img: The file was not recognized as a valid object file

Yes, it's expected. This file is not executable as is.

❯ file kernels/netbsd-GENERIC64.img
kernels/netbsd-GENERIC64.img: Linux kernel ARM64 boot executable Image, little-endian, 4K pages

qemu is starting fine and the kernel is booting, so the problem (if any) doesn't come from the kernel file format.

/bin/sh: Can't open /etc/MAKEDEV

/etc/MAKEDEV is not present in the mport image, so yeah it won't work ;)

It's referenced in /etc/rc

@alexpacio
Author

alexpacio commented Nov 8, 2025

I've tried to work around every reason why startnb.sh picks the "com" console mode instead of "viocon" on macOS (arm64, Tahoe), but QEMU freezes.
There are more problems on macOS than there should be, so I'm going to make this PR work only on x86_64 Linux for now. I don't have the cycles, and probably not the right knowledge, to figure out what's happening on macOS with virtio-console ports. Also, I currently don't need to benchmark the macOS flavour because, in my case, its usage is not really time-sensitive.


3 participants