If you use the hyphen without a first number, cut returns everything from position 1 up to the number. If you use the hyphen without a second number, cut returns everything from the first number to the end of the stream or line.

echo 'how-to geek' | cut -b -6

echo 'how-to geek' | cut -b 8-

Using cut with characters is pretty much the same as using it with bytes. In both cases, special care must be taken with complex characters. The -c (character) option tells cut to work in terms of characters, not bytes. On many systems, though, including those with stock GNU cut, -c currently selects bytes, which is why multibyte characters need care.

echo 'how-to geek' | cut -c 1,5,8

echo 'how-to geek' | cut -c 8-11

“Piñata” is a six-letter word, so asking cut to return the characters from one to six should return the entire word. It doesn't: to see the whole word we have to ask for the characters from one to seven.

echo 'piñata' | cut -c 1-6

echo 'piñata' | cut -c 1-7

The issue is that the character “ñ” is actually made up of two bytes. To see this, we've got a short text file containing this line of text:

cat unicode.txt

We'll examine that file with the hexdump utility. Using the -C (canonical) option gives us a table of hexadecimal digits with the ASCII equivalent on the right.

hexdump -C unicode.txt

In the ASCII column, the “ñ” isn't shown; instead, there are dots representing two non-printable characters. These are the two bytes that encode the “ñ” in the hexadecimal table. The displaying program, in this case the terminal running the Bash shell, uses these two bytes to identify the “ñ.” Many Unicode characters use three or more bytes to represent a single character.

If we ask for character 3 or character 4 on its own, we're shown the symbol for a non-printing character, because each request returns only half of the “ñ.” If we ask for bytes 3 and 4 together, they are displayed as “ñ.”

echo 'piñata' | cut -c 3

echo 'piñata' | cut -c 4

echo 'piñata' | cut -c 3-4

We can ask cut to split lines of text using a specified delimiter.
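The byte ranges and the two-byte “ñ” discussed above can be checked with a small script. This is only a sketch: the helper names byte_slice and byte_count are made up for illustration, and the sample word is built from explicit UTF-8 bytes (an assumption about the encoding) so the result doesn't depend on how the script file itself is saved. It sticks to -b because the behavior of -c varies between cut implementations and locales.

```shell
#!/bin/sh
# Build "piñata" from explicit UTF-8 bytes: "ñ" is the two bytes C3 B1
# (octal 303 261). This assumes UTF-8, as in the article's example.
word=$(printf 'pi\303\261ata')

# Hypothetical helper: select bytes from a string.
# $1 = text, $2 = a cut-style list such as -6, 8- or 3-4
byte_slice() {
    printf '%s\n' "$1" | cut -b "$2"
}

# Hypothetical helper: count the bytes in a string.
byte_count() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

byte_slice 'how-to geek' -6    # open start: bytes 1 through 6
byte_slice 'how-to geek' 8-    # open end: byte 8 through the end of the line
byte_count "$word"             # six letters, but seven bytes
byte_slice "$word" 1-6         # one letter short: the range ends before the final "a"
byte_slice "$word" 1-7         # all seven bytes, so the whole word
# Dump bytes 3 and 4 in hex: the two bytes of "ñ", followed by cut's newline.
byte_slice "$word" 3-4 | od -An -tx1
```

Piping a slice through od, as in the last line, is a quick way to confirm whether a range has split a multibyte character before the terminal hides the problem behind a placeholder symbol.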