As in many other programming languages, strings are used all the time in Go. However, they work somewhat differently here than in languages like Java, C++, or Python. Let's look at them in detail.
Strings and string declaration
Put simply, a string is a sequence of variable-width characters, each represented by one or more bytes in UTF-8 encoding. It might sound scary, but it basically means that you can put any existing symbol in a string, and we will show you exactly how.
We declare a string in double quotes, and there can be special characters inside these double quotes, such as line breaks or tabs:
// The default value for string is "" or `` - empty string
var myFirstString string
// A string with special characters
var iAmSpecial = "Hello\n\t"
/*
Escape sequence Value
\n A newline character
\t A tab character
\" Double quotation marks
\\ A backslash
*/
In case you need a string that retains special characters as plain text, all you have to do is put it in backticks ``:
fmt.Println("ABOBA\n\t") // This string consists of the word ABOBA, a newline, and a tab
/* Output:
ABOBA
*/
fmt.Println(`ABOBA\n\t`) // This string consists of the word ABOBA\n\t
/* Output:
ABOBA\n\t
*/
String contents are immutable in Go, so concatenating two strings creates a new string in memory:
discover := "hello" // the variable contains the "hello" string
discover = discover + " there" // the variable now holds a new "hello there" string;
// the old "hello" value is no longer referenced by the variable
discover = "world" // the variable again has a new value;
// the "hello there" string is now unreferenced and can be reclaimed by the garbage collector
String values like "hello" and "world" are immutable, but you can assign a new value to a string variable. Reassignment never modifies the old string in place: as long as the string "hello" exists somewhere in memory, Go will not change the contents of that memory location.
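To make the immutability concrete, here is a minimal sketch. The helper name replaceFirstByte is hypothetical and introduced only for illustration. It shows one way to get a "modified" string: copy the bytes into a []byte, change the copy, and build a new string from it. The original string stays untouched.

```go
package main

import "fmt"

// replaceFirstByte is a hypothetical helper used only for this illustration.
// It returns a copy of s whose first byte is replaced by b.
func replaceFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // copies the string's bytes into a new, mutable slice
	buf[0] = b       // modifies the copy, not the original string
	return string(buf)
}

func main() {
	original := "hello"
	// original[0] = 'j' // would not compile: string bytes cannot be assigned to

	changed := replaceFirstByte(original, 'j')
	fmt.Println(original) // hello
	fmt.Println(changed)  // jello
}
```

Replacing a single byte like this is only safe for one-byte (ASCII) characters; multi-byte characters need rune-aware handling.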
UTF-8
Remember we mentioned the scary word UTF-8? In short, it means that a string can contain symbols from almost any alphabet.
Side note: in UTF-8, each Unicode character (code point) occupies from 1 to 4 bytes.
// UTF-8 works out of the box
var russian = "Привет, Мир!"
korean := "안녕하세요 월드입니다!"
var emoji = "🙋🌍❗"
As we've mentioned earlier, each string is a sequence of bytes. Hence, if you need to find out its byte length, you can use the function len(yourStringName).
asciiString := "ABCDE"
utf8String := "БГДЖИ"
fmt.Println(len(asciiString)) // 5
fmt.Println(len(utf8String)) // 10
Runes
Go uses values of the rune type to represent Unicode characters. The language defines rune as an alias for int32, so a program can make it clear when an integer value represents a code point. As a result, you can view a string not only as a sequence of bytes but also as a sequence of runes.
Depending on the use case, strings are commonly regarded as sequences of bytes (encoded in UTF-8) when transferring data and as sequences of runes when it is required to check each individual character of the string.
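The two views can be compared side by side in a short sketch. The sample string and the helper name byteLengths are assumptions made for this illustration. Indexing a string with s[i] yields raw bytes, while a for range loop decodes the string rune by rune and reports the byte index where each rune starts.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteLengths is a hypothetical helper used only for this illustration.
// It returns the UTF-8 byte length of every rune in s.
func byteLengths(s string) []int {
	var lengths []int
	for _, r := range s { // range decodes the string one rune at a time
		lengths = append(lengths, utf8.RuneLen(r))
	}
	return lengths
}

func main() {
	s := "Aб🌍"

	// Byte view: indexing returns individual bytes.
	for i := 0; i < len(s); i++ {
		fmt.Printf("%x ", s[i]) // 41 d0 b1 f0 9f 8c 8d
	}
	fmt.Println()

	// Rune view: range yields the starting byte index and the rune itself.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r) // 0:A 1:б 3:🌍
	}
	fmt.Println()

	fmt.Println(byteLengths(s)) // [1 2 4]
}
```

Note how the rune indexes jump from 1 to 3: the Cyrillic letter takes two bytes, so the next rune starts two bytes later.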
Recall the earlier len example with asciiString and utf8String: len returns the number of bytes, not characters. If you are interested in the length of a string in characters, use the function RuneCountInString from the unicode/utf8 package:
asciiString := "ABCDE"
utf8String := "БГДЖИ"
fmt.Println(utf8.RuneCountInString(asciiString)) // 5
fmt.Println(utf8.RuneCountInString(utf8String)) // 5
// Emoji example 🗿
emoji := "🙋🌍❗"
fmt.Println(len(emoji)) // 11
fmt.Println(utf8.RuneCountInString(emoji)) // 3
Conclusion
Let's briefly recap what we have covered. We have learned that Go supports two styles of string literals: the double-quote style and the back-quote style (raw string literals). The zero value of the string type is the empty string, written as "" or `` in a literal.
Final remark: when you need to work with the individual characters of a string, it is safer to convert it into []rune first, but we will cover this in detail in another topic.
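As a small preview, the sketch below shows what the conversion looks like. The helper name reverse is hypothetical and exists only for this illustration. After converting to []rune, each element is one code point, so indexing and swapping work per character rather than per byte.

```go
package main

import "fmt"

// reverse is a hypothetical helper used only for this illustration.
// It returns s with its runes in reverse order.
func reverse(s string) string {
	runes := []rune(s) // decodes the UTF-8 bytes into a slice of code points
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes) // encodes the runes back into a UTF-8 string
}

func main() {
	s := "БГДЖИ"
	runes := []rune(s)

	fmt.Println(len(runes))       // 5
	fmt.Println(string(runes[1])) // Г
	fmt.Println(reverse(s))       // ИЖДГБ
}
```

Keep in mind that a rune is a single code point; some symbols that look like one character on screen are built from several code points, so even []rune is not a complete answer for every text.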