String conversions in Rust

Rust is famous for its rigorous type and memory model. Strings in Rust are anything but trivial. In this short tutorial, I'll show how to convert between the different string types in Rust.

A list of string types

In standard Rust you have the following types of strings:

  • String is an owned, growable UTF-8 encoded string; this is the most commonly used string type in Rust.
  • str is the string slice. It's almost always used in its borrowed form, &str. Sometimes you have to specify a lifetime parameter when the string slice appears in a function's parameters or return value, for example, &'a str.
  • std::ffi::CString is an owned C-style string provided by the std::ffi module: it is null-terminated and contains no interior null bytes.
  • std::ffi::CStr represents a borrowed C string.
  • std::ffi::OsString represents an owned string generated by the OS or passed to the OS.
  • std::ffi::OsStr represents a borrowed OS string.
  • libc::c_char represents a C character. For C interop you'll need the libc crate, which provides this type.
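The OS string types in the list above are not converted anywhere else in this tutorial, so here is a minimal sketch using only the standard library. The helper names to_os_string and os_to_string are made up for illustration.

```rust
use std::ffi::{OsStr, OsString};

/// Convert a UTF-8 string slice into an owned OS string (always succeeds).
fn to_os_string(s: &str) -> OsString {
    OsString::from(s)
}

/// Convert a borrowed OS string back into a String.
/// Returns None when the OS string is not valid Unicode.
fn os_to_string(os: &OsStr) -> Option<String> {
    os.to_str().map(String::from)
}
```

Going from &str to OsString cannot fail, while the reverse direction can, because the OS may hand you bytes that are not valid Unicode.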

From String to CString

use std::ffi::CString;

let s = String::from("random");
let cs = CString::new(s.clone()).unwrap();

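Use the CString::new method. It takes ownership of its argument, so passing s directly would consume the original String; to keep using s afterwards, pass a clone, as in CString::new(s.clone()). The call returns an error if the string contains an interior null byte, which is why the example unwraps the result.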
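If you would rather not unwrap, a small wrapper can surface the interior-null error to the caller. This is only a sketch; to_cstring is a hypothetical helper name. Because CString::new accepts anything that converts into a Vec<u8>, passing a &str copies the bytes and leaves the original string untouched.

```rust
use std::ffi::{CString, NulError};

/// Build a CString from a borrowed &str, leaving the caller's string untouched.
/// Fails if the input contains an interior NUL byte.
fn to_cstring(s: &str) -> Result<CString, NulError> {
    CString::new(s)
}
```

On failure, NulError::nul_position reports the index of the offending byte, which is handy for error messages.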

From CString to C char pointer

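A C-style string contains no interior null bytes and always ends with a null byte. CString provides the into_raw method, which returns a raw pointer that is safe to pass to C APIs expecting a C string. into_raw transfers ownership of the buffer to the caller, so the pointer must eventually be handed back to CString::from_raw to free the memory; otherwise it leaks.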

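use libc::c_char;
// clone() keeps cs usable; the clone's buffer is now owned by p and must be reclaimed with CString::from_raw
let p: *mut c_char = cs.clone().into_raw();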
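To make the ownership hand-off concrete, here is a minimal round trip. It is a sketch that uses std::os::raw::c_char instead of libc::c_char (both are aliases for the same platform type) so that it compiles without extra crates; raw_roundtrip is a made-up name.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hand a string out as a raw pointer, then take ownership back so it is freed.
fn raw_roundtrip(s: &str) -> String {
    let cs = CString::new(s).expect("no interior NUL bytes");
    // into_raw gives up ownership; Rust will not free this buffer on its own.
    let p: *mut c_char = cs.into_raw();
    // ... p could be passed to a C function here ...
    // SAFETY: p came from CString::into_raw and has not been freed or resized.
    let back = unsafe { CString::from_raw(p) };
    back.into_string().expect("valid UTF-8")
}
```

Only pointers obtained from into_raw may be passed to from_raw; a pointer allocated by C code must be freed by the C side instead.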

From C char pointer to CStr

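CStr::from_ptr is unsafe: the pointer must be non-null and point to a valid null-terminated string that outlives the returned reference. It returns a borrowed &CStr, not an owned value.

use std::ffi::CStr;
let cs: &CStr = unsafe { CStr::from_ptr(cp) };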

From C char pointer to String

use libc::c_char;
use std::ffi::CStr;
let cs: String = unsafe { CStr::from_ptr(cp) }.to_owned().into_string().unwrap();
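A pointer coming from C may be null or may not hold valid UTF-8, so a more defensive variant can be useful. The following is a sketch under those assumptions; ptr_to_string is a hypothetical helper, and std::os::raw::c_char stands in for libc::c_char.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Copy a C string into an owned String.
/// Returns None for a null pointer or for bytes that are not valid UTF-8.
///
/// # Safety
/// If non-null, cp must point to a valid NUL-terminated string.
unsafe fn ptr_to_string(cp: *const c_char) -> Option<String> {
    if cp.is_null() {
        return None;
    }
    // SAFETY: the caller guarantees cp is valid and NUL-terminated.
    let borrowed: &CStr = unsafe { CStr::from_ptr(cp) };
    borrowed.to_str().ok().map(String::from)
}
```

If you prefer to keep going on invalid UTF-8, CStr::to_string_lossy is the alternative to to_str.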

From String to Vec<u8>

Get the UTF-8 bytes of the string:

let v: Vec<u8> = s.as_bytes().into();
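There are two ways to get at the bytes, depending on whether you still need the String afterwards. A small sketch (the function names are made up for illustration):

```rust
/// Copy the UTF-8 bytes out of a borrowed string; the string stays usable.
fn bytes_copied(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

/// Reuse the String's buffer without copying; the String is consumed.
fn bytes_moved(s: String) -> Vec<u8> {
    s.into_bytes()
}
```

Note that the length of the result is the number of bytes, not the number of characters: non-ASCII characters take more than one byte in UTF-8.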

Hex encoded string to Vec<u8>

Use the hex crate to decode a hex-encoded string:

// hex::decode is a free function, so no trait import is needed
let data: Vec<u8> = hex::decode(s_hex).unwrap();

If the input string has a 0x prefix, remove it before calling hex::decode; otherwise decoding fails because x is not a hex digit.
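To illustrate the prefix handling without pulling in a dependency, here is a std-only stand-in for hex::decode. It is a sketch, not the hex crate's implementation, and decode_hex is a made-up name. With the hex crate itself, the equivalent (assuming s_hex is a &str) would be hex::decode(s_hex.strip_prefix("0x").unwrap_or(s_hex)).

```rust
/// Decode a hex string into bytes, accepting an optional "0x" prefix.
/// Returns None on odd length or on any non-hex character.
fn decode_hex(input: &str) -> Option<Vec<u8>> {
    let digits = input.strip_prefix("0x").unwrap_or(input);
    // Checking every byte up front also guarantees the slicing below is on ASCII.
    if digits.len() % 2 != 0 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..digits.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&digits[i..i + 2], 16).ok())
        .collect()
}
```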

From Vec<u8> to String

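Convert UTF-8 encoded bytes back to a string. Note that from_utf8_lossy returns a Cow<str> rather than a String and replaces invalid byte sequences with U+FFFD; call into_owned() if you need an owned String:

let msg = String::from_utf8_lossy(&msg_data);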
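Whether silent replacement is acceptable depends on where the bytes come from. The sketch below contrasts the strict and the lossy conversion; both helper names are hypothetical.

```rust
/// Strict conversion: fails on invalid UTF-8, reuses the buffer on success.
fn bytes_to_string_strict(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

/// Lossy conversion: invalid sequences become U+FFFD, so it always succeeds.
fn bytes_to_string_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

For data you control, the strict variant is usually the safer default, since it tells you when the bytes were not what you expected.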

Integer types to hex

You can use the LowerHex or UpperHex formatting traits, which back the {:x} and {:X} format specifiers.

    let i = 1024;
    println!(
        "{} as hex without prefix:                             {:x}",
        i, i
    );
    // 1024 as hex without prefix:                             400
    println!(
        "{} as hex with 0x prefix:                             {:#x}",
        i, i
    );
    // 1024 as hex with 0x prefix:                             0x400
    println!(
        "{} as hex with left padding zeros and without prefix: {:08x}",
        i, i
    );
    // 1024 as hex with left padding zeros and without prefix: 00000400
    println!(
        "{} as hex with left padding zeros and with 0x prefix: {:#08x}",
        i, i
    );
    // 1024 as hex with left padding zeros and with 0x prefix: 0x000400

As the last example shows, the width also counts the 0x prefix: {:#08x} produces eight characters in total, only six of which are hex digits.
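If you think in terms of hex digits rather than total width, you can add the two prefix characters yourself. A small sketch with made-up helper names, which also shows the uppercase variant:

```rust
/// Format a u32 as lowercase hex with a 0x prefix and at least `digits`
/// zero-padded hex digits (wider values are never truncated).
fn hex_with_prefix(value: u32, digits: usize) -> String {
    // The width given to the formatter includes the two characters of "0x".
    format!("{:#0width$x}", value, width = digits + 2)
}

/// Uppercase digits via the UpperHex trait ({:X}).
fn hex_upper(value: u32) -> String {
    format!("{:X}", value)
}
```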

U256 to hex string

U256 is a very popular data type in blockchain development; it represents a 256-bit (32-byte) unsigned integer. To convert a U256 to a hex string:

use primitive_types::U256;
let amount = U256::from_dec_str("9999").unwrap();
// If you don't want the `0x` prefix:
let amount_as_hex = format!("{:064x}", amount);
// If you want the `0x` prefix:
let amount_as_hex = format!("{:#066x}", amount);

Address/H160 and hex string conversion

In Ethereum, an address is represented as an H160, a 160-bit (20-byte) value. To convert it to a hex string:

use std::str::FromStr;
use hex::ToHex;
use primitive_types::H160; // assumed source crate, matching U256 above

let addr = H160::from_str("0xf000000000000000000000000000000000000000").unwrap();
// ToHex already does the encoding for us; we only need to pad zeros on the left
println!("{:0>64}", addr.encode_hex::<String>());
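The padding step does not depend on H160 or the hex crate. As a std-only sketch, the same idea applied to a plain 20-byte array looks like this (address_to_padded_hex is a hypothetical helper, and the manual hex encoding stands in for encode_hex):

```rust
/// Hex-encode a 20-byte address and left-pad it with zeros to a 32-byte word
/// (64 hex characters), without a 0x prefix.
fn address_to_padded_hex(addr: &[u8; 20]) -> String {
    // Two lowercase hex digits per byte gives 40 characters.
    let hex: String = addr.iter().map(|b| format!("{:02x}", b)).collect();
    // Fill with '0' on the left (right-align) up to 64 characters.
    format!("{:0>64}", hex)
}
```

Here {:0>64} means: use 0 as the fill character, right-align the value, and pad to a width of 64.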