Category Archives: Uncategorized

The Computer Language Benachmarks Game: Rust ranks #1 for n-body

The Computer Language Benchmarks Game is a free software project for comparing how a given subset of simple algorithms can be implemented in various popular programming languages.

I converted the fastest (dating early 2019) n-body C-implementation (#4) to Rust (#7) in a one-to-one fashion, gaining a performance encreasement by factor 1.6 to my own surprise.

The moment writing, the benachmarks are measured on quad-core 2.4Ghz Intel® Q6600® ; dating back to the year 2006. This ancient hardware supports the SSSE3 64bit-SIMD instructions; both implementations are making use them intensively.

The conversion of the implementation from C to Rust was fun and educational; listing just a few:

  • The pattern of “static global memory” of C had to be transformed to similar code just based on he Rust ownership model.
  • Dealing with memory alignment-directives in Rust
  • Using the early, stable SIMD-API of Rust, namely std::arch::x86_64::*

I must admit that newer test-hardware with support for more advanced AVX-512 SIMD-instructions, would permit to run an even faster Rust-implementation of the n-body task as part of these benchmarks.

Rust crates – frehberg’s annotated catalogue

The base of Rust users and contributors is growing steadily. The amount of libraries (aka crates) at http://crates.io is growing quickly; the overall “noise” is increasing. Some libraries might not be maintained any longer 🙁

This annotated catalogue shall help the Rust-users to find specific, popular, mature Rust crates. This list is WIP (Work In Progress), reflecting my personal shortlist. The ordering in the table top-down doesn’t express any preference.

For more extensive listings, please check out content of https://crates.io (or the categorized search engine https://lib.rs) and corresponding API-docs at https://docs.rs

Other sources of information might be:

NameDescription
nomnom is a parser combinators library written in Rust. Its goal is to provide tools to build safe parsers without compromising the speed or memory consumption.
pestpest is a general purpose parser written in Rust with a focus on accessibility, correctness, and performance. It uses parsing expression grammars (or PEG) as input.
assert_matchesProvides a macro, assert_matches, which tests whether a value matches a given pattern, causing a panic if the match fails.
bytesA utility library for working with bytes.
serdeSerde is a framework for serializing and deserializing Rust data structures efficiently and generically.
serde_jsonSerde JSON serializing and deserializing implementation for Rust.
tokioTokio is an event-driven, non-blocking I/O platform for writing asynchronous applications with the Rust programming language.
romioAsynchronous network primitives in Rust. Romio combines the powerful futures abstractions with the nonblocking IO primitives of mio to provide efficient and ergonomic asynchronous IO primitives for the Rust asynchronous networking ecosystem. Whereas Tokio is using stable Rust-features only, Romio may use so called "unstable" Rust-features from nightly releases.
uomUnits of measurement is a crate that does automatic type-safe zero-cost dimensional analysis.
mockiatoA strict, yet friendly mocking library for Rust
proptestProptest is a property testing framework (i.e., the QuickCheck family) inspired by the Hypothesis framework for Python. It allows to test that certain properties of your code hold for arbitrary inputs, and if a failure is found, automatically finds the minimal test case to reproduce the problem.
test-case-deriveThis crate provides #[test_case] procedural macro attribute that generates multiple parametrized tests using one body with different input parameters.
env_loggerImplements a logger that can be configured via environment variables. env_logger makes sense when used in executables (binary projects). Libraries should use the log crate instead.
gtkRust bindings and wrappers for GLib, GDK 3, GTK+ 3 and Cairo; the popular, platform independent GUI library.
rust-qtDo not use! In contrast to the gtk crate, this GUI library is incomplete and it seems it is no longer maintained.
flutter-engineExperimental! Flutter is Google’s portable UI toolkit for building beautiful, natively-compiled applications for mobile, web, and desktop from a single codebase.
gfxgfx is a high-performance, bindless graphics API for the Rust programming language. It aims to be the default API for Rust graphics: for one-off applications, or higher level libraries or engines.
lazy_staticA macro for declaring lazily evaluated statics in Rust. Using this macro, it is possible to have statics that require code to be executed at runtime in order to be initialized.
regexA Rust library for parsing, compiling, and executing regular expressions. Its syntax is similar to Perl-style regular expressions, but lacks a few features like look around and backreferences. In exchange, all searches execute in linear time with respect to the size of the regular expression and search text. Much of the syntax and implementation is inspired by RE2.
memmapA cross-platform Rust API for memory mapped buffers. May be used to speed-up file-IO.
randA Rust library for random number generation. Rand provides utilities to generate random numbers, to convert them to useful types and distributions, and some randomness-related algorithms.
bastionBastion is a Fault-tolerant Runtime for Rust applications, a supervisor framework as known from Erlang.
test-generatorTest-Generator provides the macros test_resources and bench_resources, executing a test for all items matching a specific glob-pattern, such as "res/tests/*.idl"
test-generator-utestTest-Generator-Utest providing 3-phase testing macro "utest!(setup, test, teardown)"

Rust – Handling Executables and their Debug-Symbols

This post is about compiling Rust-code, the executables, the handling of the corresponding debug symbols, build-ids and core-files. It highlights the importance of debug-symbols for debugging and how to strip the debug-symbols off the binary before shipping to customer.

Let’s re-use the existing cargo Rust-project for the following samples https://github.com/frehberg/rust-releasetag/tree/master/ This project enables us to produce crash core-files. A simplified main.rs looks like

use std::time::Duration; 
use std::thread;
use std::io::stdout;
use std::io::Write;

fn main() {
    println!("Waiting until being aborted");
    loop {
      thread::sleep(Duration::from_millis(200));
      print!(".");
      stdout().flush().ok();
    }
}

The command cargo build will produce a debug binary

-rwxr-xr-x 2 frehberg frehberg 1698344 Mai 14 21:22 target/release/test-tag

And the command cargo build --release will produce a release binary

-rwxr-xr-x 2 frehberg frehberg 1831056 Mai 14 21:22 target/debug/test-tag

Both vary only a few KBytes, still both contain debug-symbols.

The following command will produce a release binary, stripping away the debug symbols

RUSTFLAGS='-C link-arg=-s' cargo build --release

and the resulting binary will be only a fraction of size

-rwxr-xr-x 2 frehberg frehberg 198992 Mai 14 21:27 target/release/test-tag

So why is the tool cargo keeping the debug symbols? Simple answer, those debug symbols are required by debugging tools like gdb, etc. to map the blocks of binary operations onto the original Rust source code. The debug symbols cover all of the Rust code having been compiled into the binary. That might explain the additional size and overhead. This can be demonstrated best with the debug-build, as the release build is optimizing and re-ordering the commands.

$ cargo build
...  
$ gdb target/debug/test-tag 
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from target/debug/test-tag...done.
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/frehberg/src/proj-releasetag/rust-releasetag/test/target/debug/test-tag.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) l
1    use std::time::Duration;
2    use std::thread;
3    use std::io::stdout;
4    use std::io::Write;
5    
6    fn main() {
7        println!("Waiting until being aborted");
8        loop {
9          thread::sleep(Duration::from_millis(200));
10          print!(".");
(gdb) quit

Only stripped binaries should be shipped to customers, but corresponding debug symbols should be kept in the back-hand for debugging purposes. Linux provides command line tools to strip the debug-symbols from executables and binaries.

Note: At this point, it is important to understand that each binary being produced by compiler gcc, or llvm (rustc) is tagged with a unique sha1 BuildId, and this Id can be extracted from binary using the tool file (read gdb docu). The following binary contains the BuildId 11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5

$ file target/release/test-tag 
target/release/test-tag: ELF 64-bit
LSB shared object, x86-64, version 1 (SYSV), 
dynamically linked, interpreter /lib64/l, 
for GNU/Linux 3.2.0, 
BuildID[sha1]=11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5, 
with debug_info, not stripped

To avoid erroneous debugging sessions, the debugging tools do enforce that build-id for executable and debug-symbols are identical. Depending on compiler-version, build-flags, etc. the build-id may change for identical Rust-code.

As little changes (and maybe timestamps) will influence and change the BuildId, a single build-queue should be used to produce a file containing both, debug-symbols and executable code. And at the end of the build-process, these files should be archived and the stripped variants should be derived for delivery.

Lemma: Never embed build-timestamps or other dynamic environment-values into your code, as two builds of identical sources would result in slightly different binaries, containing different BuildIds. It would not be possible to rebuild a specific release-branch of your repo and using such binaries and debug-symbols to analyze a core file from crash-report.

The presence of debug-symbols even in cargo’s release-build permits the archive them together with debug-symbols and to strip the debug-symbols before shipping to customer, as demonstrated in the following code.

cargo build --release
cp target/release/test-tag  test-tag.dbg
strip target/release/test-tag
ls -al test-tag.dbg target/release/test-tag
-rwxr-xr-x 2 frehberg frehberg  198992 Mai 14 22:38 target/release/test-tag
-rwxr-xr-x 1 frehberg frehberg 1502400 Mai 14 22:38 test-tag.dbg

The tool file will proof the executable no longer contains any debug symbols. Please note that both files will contain the identical BuildID[sha1]` `11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5

file target/release/test-tag
target/release/test-tag: ELF 64-bit LSB shared object, 
x86-64, version 1 (SYSV), dynamically linked, 
interpreter /lib64/l, for GNU/Linux 3.2.0,
BuildID[sha1]=11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5, stripped

$ file test-tag.dbg 
test-tag.dbg: ELF 64-bit LSB shared object, 
x86-64, version 1 (SYSV), dynamically linked, 
interpreter /lib64/l, for GNU/Linux 3.2.0,
BuildID[sha1]=11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5, with debug_info, not stripped

Extracting debug-symbols exclusively into a separate file can be performed with the following command

objcopy --only-keep-debug target/release/test-tag test-tag.dbg

Note I cannot recommend to archive executable and debug-symbols in separate files, as often the tool gdb is facing problems to merge both back in debugging sessions later, when feeding in two files as following

gdb -s test-tag.dbg -e target/release/test-tag 

Assuming the debug-symbols and the executable are handled in a single file test-tag.dbg it is possible to run and step thru the code with tool gdb, simply with the following command

$ gdb test-tag.dbg 
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from test-tag.dbg...done.
(gdb) run
Starting program: /home/frehberg/src/proj-releasetag/rust-releasetag/test/test-tag.dbg 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Waiting until being aborted
..........^C
Program received signal SIGINT, Interrupt.
0x00007ffff77bbc31 in __GI___nanosleep (requested_time=0x7fffffffd8c0, remaining=0x7fffffffd8c0)
    at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28	../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.
(gdb) bt
#0  0x00007ffff77bbc31 in __GI___nanosleep (requested_time=0x7fffffffd8c0, 
    remaining=0x7fffffffd8c0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1  0x000055555555a8f1 in sleep () at src/libstd/sys/unix/thread.rs:153
#2  sleep () at src/libstd/thread/mod.rs:780
#3  0x0000555555558081 in test_tag::main ()
#4  0x0000555555558673 in std::rt::lang_start::{{closure}} ()
#5  0x00005555555627c3 in {{closure}} () at src/libstd/rt.rs:49
#6  do_call<closure,i32> () at src/libstd/panicking.rs:293
#7  0x000055555556454a in __rust_maybe_catch_panic () at src/libpanic_unwind/lib.rs:85
#8  0x000055555556327d in try<i32,closure> () at src/libstd/panicking.rs:272
#9  catch_unwind<closure,i32> () at src/libstd/panic.rs:388
#10 lang_start_internal () at src/libstd/rt.rs:48
#11 0x0000555555558192 in main ()

One more practical detail In case of process-crashes (eg caused by panic!()) the system will dump the process (stack) into a so called core-file, containing the BuildId of the executable, as well as the BuildIds of all dynamic shared-object libraries (so-files), as found on that platform when starting up the process (not the so-files in your build-environment). Tools as gdb will verify for identical build-ids in core file and the debug-symbol files, otherwise ignoring them.

The tool eu-unstrip (debian package elfutils) permits to extract the corresponding build-ids of executable and of the dynamic libraries during runtime (BuildId 11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5 in first line of output). The BuildIds of the dynamic libraries are:

  • e79e03dc6f0672a9832c68270af07f68f649daf8 –> linux-vdso.so.1
  • 64df1b961228382fe18684249ed800ab1dceaad4 –> ld-2.27.so ld-linux-x86-64.so.2
  • etc.
$ eu-unstrip -n --core=core
0x560be742c000+0x231000 11e0989b7ecb6ec4cd87c526a1dcd7ba3a2a81f5@0x560be742c2bc . - /home/frehberg/src/proj-releasetag/rust-releasetag/test/target/release/test-tag
0x7ffca42e3000+0x2000 e79e03dc6f0672a9832c68270af07f68f649daf8@0x7ffca42e37d0 . - linux-vdso.so.1
0x7f38519ba000+0x229170 64df1b961228382fe18684249ed800ab1dceaad4@0x7f38519ba1d8 /lib64/ld-linux-x86-64.so.2 /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.27.so ld-linux-x86-64.so.2
0x7f3850d86000+0x3f0ae0 b417c0ba7cc5cf06d1d1bed6652cedb9253c60d0@0x7f3850d86280 /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.27.so libc.so.6
0x7f3851177000+0x217430 f98df367fb1e663c3b1a49ef86b42e9ec66754f2@0x7f38511771d8 /lib/x86_64-linux-gnu/libgcc_s.so.1 - libgcc_s.so.1
0x7f385138f000+0x21e480 28c6aade70b2d40d1f0f3d0a1a0cad1ab816448f@0x7f385138f248 /lib/x86_64-linux-gnu/libpthread.so.0 /usr/lib/debug/.build-id/28/c6aade70b2d40d1f0f3d0a1a0cad1ab816448f.debug libpthread.so.0
0x7f38515ae000+0x207be0 9826fbdf57ed7d6965131074cb3c08b1009c1cd8@0x7f38515ae1d8 /lib/x86_64-linux-gnu/librt.so.1 /usr/lib/debug/lib/x86_64-linux-gnu/librt-2.27.so librt.so.1
0x7f38517b6000+0x203110 25ad56e902e23b490a9ccdb08a9744d89cb95bcc@0x7f38517b61d8 /lib/x86_64-linux-gnu/libdl.so.2 /usr/lib/debug/lib/x86_64-linux-gnu/libdl-2.27.so libdl.so.2

Please, compare the list above with the list of shared-objects the executable depends on

$  ldd target/release/test-tag
	linux-vdso.so.1 (0x00007ffef89b0000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f543fa37000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f543f82f000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f543f610000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f543f3f8000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f543f007000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f543fe6c000)

Now, if you managed to fetch the corresponding debug-symbol files from archive for executable and dynamic librarries, the tool gdb can be used to print the backtrace of the stack the moment the process has been aborted.

Note: The solib-search-path of tool gdbdefines the locations the shared object files are searched for. The default value for the solib-search-path variable is the working directory “.” So, easiest is to collect the specific shared-obejct (so) libs with matching BuildIds, place them in a single folder, and changing to that directory to execute the tool gdb.

$ gdb --core core target/release/test-tag 
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from target/release/test-tag...done.
[New LWP 12597]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./target/release/test-tag'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f4afca12c31 in __GI___nanosleep (requested_time=0x7ffd0ddd6190, 
    remaining=0x7ffd0ddd6190) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28    ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.
(gdb) bt
#0  0x00007f4afca12c31 in __GI___nanosleep (requested_time=0x7ffd0ddd6190, 
    remaining=0x7ffd0ddd6190) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1  0x0000560ae96808f1 in sleep () at src/libstd/sys/unix/thread.rs:153
#2  sleep () at src/libstd/thread/mod.rs:780
#3  0x0000560ae967e081 in test_tag::main ()
#4  0x0000560ae967e673 in std::rt::lang_start::{{closure}} ()
#5  0x0000560ae96887c3 in {{closure}} () at src/libstd/rt.rs:49
#6  do_call<closure,i32> () at src/libstd/panicking.rs:293
#7  0x0000560ae968a54a in __rust_maybe_catch_panic () at src/libpanic_unwind/lib.rs:85
#8  0x0000560ae968927d in try<i32,closure> () at src/libstd/panicking.rs:272
#9  catch_unwind<closure,i32> () at src/libstd/panic.rs:388
#10 lang_start_internal () at src/libstd/rt.rs:48
#11 0x0000560ae967e192 in main ()

Note: By default core files will contain process-stacks of main-thread and other threads only. So if the process status or user settings are required to understand the situation or use-case the crash occured, it is handy to store such information on stack (as done for crate releasetag 😉 Changing settings of the OS, also heap-memory may be dumped into the core file. This might tell you more details, but keep in mind that you might receive GByte-core-files.

Summary

Any time an executable shall be shipped to customer, perform the following steps, otherwise if the debug-symbols got lost, you might not be able run the released code in debugger, nor being able to analyze core-files:

  1. At first extract the BuildId from executable(s) and dynamic libraries,
  2. Then archive the executable(s) and dynamic libraries and using the build-ids from previous step as database-keys,
  3. Finally strip the debug-symbols from these executables and dynamic libraries before packaging and shipping to customer.
  4. Receiving a crash-report, extract the build-id from core file, fetch the corresponding binaries containing the debug-symbols, and analyze what caused the crash.

EDIT

Release mode optimization can be configured in Cargo.toml the following way, being used only when calling “cargo build –release”, here adding debug symbols to your binary.

[profile.release]
opt-level = 3
strip = false
debug = true
codegen-units = 1
lto = true

The optimization levels correspond are defined as

The opt-level setting controls the -C opt-level flag which controls the level of optimization. Higher optimization levels may produce faster runtime code at the expense of longer compiler times. Higher levels may also change and rearrange the compiled code which may make it harder to use with a debugger.

The valid options are:

  • 0: no optimizations
  • 1: basic optimizations
  • 2: some optimizations
  • 3: all optimizations
  • "s": optimize for binary size
  • "z": optimize for binary size, but also turn off loop vectorization.

Xpath expression ignoring the namespace

If you need to search an xml document specifying a default namespace, the libraries such as libxml2 etc. have problems to search for xpath expressions.

The following xpath expressions will ignore any namespace, just checking for the local name of the element:

In XPath 1.0

//*[local-name()='name']

Selects any element with “name” as local-name.

In XPath 2.0 you can use:

//*:name

bash-script spinner

The following bash-function is displaying a spinner as long as data can be read from stdin, for example reading from pipe. In the moment the stdin stream is closed, the spinner will terminate.

Credits: It is a modification of the spinner of http://fitnr.com/showing-a-bash-spinner.html

#!/bin/bash

spinner()
{
  ## hide the cursor
  tput civis

  local spinstr='|/-\'
  printf " [%c] " "$spinstr"
  while read -N 30 chunk; do
     printf "\b\b\b\b\b\b"
     local temp=${spinstr#?}
     local spinstr=$temp${spinstr%"$temp"}
     printf " [%c] " "$spinstr"
  done
  ## overwrite the spinner
  printf "\b\b\b\b\b\b "
  ## return to original cursor position
  printf "\b\b\b\b\b\b"
  ## unhide the cursor
  tput cnorm
}

 

This spinner can be used to display a progress when performing an untar or  to show activity in syslog file:

tail -f /var/log/syslog | spinner

Synaptic on Ubuntu 17.10

Synaptic tool is not starting on my Ubuntu 17.10, the window does not appear. It seems that root-user lacks privileges to access the window manager.
The following command will grant access to root to open an X-window on Ubuntu 17.10 for this session only.

xhost +si:localuser:root

Afterwards execute synaptic as super-user (root)

sudo synaptic