C# Substring Programs

You want to extract several characters from your C# string as another string, which is called taking a substring. There are two overloaded Substring methods on string, which are ideal for getting parts of strings. This document contains several examples and a useful Substring benchmark, using the C# programming language.
=== Substring benchmark that tests creation time (C#) ===
    Based on .NET Framework 3.5 SP1.

New char[] array: 2382 ms
Substring:        2053 ms [faster]

Getting first part
Initially here you have a string and you want to extract the first several characters into a new string. We can use the Substring instance method with two parameters here, the first being 0 and the second being the desired length.
=== Program that uses Substring (C#) ===

using System;

class Program
{
    static void Main()
    {
        string input = "OneTwoThree";

        // Get first three characters
        string sub = input.Substring(0, 3);
        Console.WriteLine("Substring: {0}", sub);
    }
}

=== Output of the program ===

Substring: One

Description. Substring is an instance method on the string class, so you must call it on a non-null string; calling it on a null reference throws a NullReferenceException. This program extracts the first three characters into a new string, which is allocated separately on the managed heap.
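
Substring throws when the string is null or shorter than the requested length, so a small guard can help when the input length is not known in advance. The following is a minimal sketch of such a guard. The helper name FirstChars is hypothetical (it is not part of the .NET Framework), and returning an empty string for null input is an assumption made for this illustration.

```csharp
using System;

static class SubstringHelpers
{
    // Hypothetical helper: returns at most 'count' leading characters.
    // Assumption: null or empty input yields an empty string instead of throwing.
    public static string FirstChars(string input, int count)
    {
        if (string.IsNullOrEmpty(input) || count <= 0)
        {
            return string.Empty;
        }
        // Clamp the length so Substring never reads past the end.
        return input.Substring(0, Math.Min(count, input.Length));
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(SubstringHelpers.FirstChars("OneTwoThree", 3)); // One
        Console.WriteLine(SubstringHelpers.FirstChars("On", 3));          // On
        Console.WriteLine(SubstringHelpers.FirstChars(null, 3).Length);   // 0
    }
}
```

Whether a silent empty result or an exception is the better response to bad input depends on the program; the clamp is only appropriate when a shorter result is acceptable.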

Use one parameter
Here we see the Substring overloaded method that takes one parameter, the start index int. The second parameter is considered the largest possible, meaning the substring ends at the last char.
=== Example program (C#) ===

using System;

class Program
{
    static void Main()
    {
        string input = "OneTwoThree";
        // Indexes:
        // 0:'O'
        // 1:'n'
        // 2:'e'
        // 3:'T'
        // 4:'w' ...

        string sub = input.Substring(3);
        Console.WriteLine("Substring: {0}", sub);
    }
}

=== Output of the program ===

Substring: TwoThree

Description. The program text describes logic that takes all the characters in the input string excluding the first three. The end result is that you extract the last several characters. The Substring method internally causes the runtime to allocate a new string on the managed heap.

Middle string sections
Here we take several characters in the middle of a C# string and place them into a new string. To take a middle substring, pass two integer parameters to Substring. You will want each parameter to be a non-zero value to avoid taking all the edge characters.
=== Example program that uses Substring (C#) ===

using System;

class Program
{
    static void Main()
    {
        string input = "OneTwoThree";

        string sub = input.Substring(3, 3);
        Console.WriteLine("Substring: {0}", sub);

    }
}

=== Output of the program ===

Substring: Two

Description of parameters. The two parameters in the example say, "I want the substring at index 3 with a length of three." Essentially, the fourth through sixth characters. The program then displays the resulting string that is pointed to by the string reference 'sub'.

Slicing strings
Here we note that you can add an extension method to "slice" strings as is possible in languages such as JavaScript. The Substring method in C# takes a start index and a length, so it doesn't use the same semantics as slicing in JavaScript and Python, which takes a start index and an end index. However, you can develop an extension method that fills this need efficiently. See String Slice.

Exclude several characters
Here you want to not copy the last several characters of your string. This example shows how you can leave off the last five characters of the input string and get a new string instance containing the rest.
=== Program that uses Substring for ending characters (C#) ===

using System;

class Program
{
    static void Main()
    {
        string input = "OneTwoThree";

        string sub = input.Substring(0, input.Length - 5);
        Console.WriteLine("Substring: {0}", sub);
    }
}

=== Output of the program ===

Substring: OneTwo

MSDN research
Here we note some reference material on the MSDN website provided by Microsoft. The Substring articles I found on MSDN are really awful and not nearly as nice as this document. They do not say anything that you cannot find from Visual Studio's IntelliSense. Visit msdn.microsoft.com.

Exceptions raised
Here we look at exceptions that can be raised when the Substring instance method on the string type is called with incorrect parameters. When you try to go beyond the string length, or use a parameter < 0, you get the ArgumentOutOfRangeException from the internal method InternalSubStringWithChecks. Here we see an example where I trigger the ArgumentOutOfRangeException.
=== Program that shows Substring exceptions (C#) ===

using System;

class Program
{
    static void Main()
    {
        string input = "OneTwoThree";

        try
        {
            string sub = input.Substring(0, 100);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex);
        }

        try
        {
            string sub = input.Substring(-1);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex);
        }
    }
}

=== Output of the program ===

System.ArgumentOutOfRangeException
System.String.InternalSubStringWithChecks

System.ArgumentOutOfRangeException
System.String.InternalSubStringWithChecks

Benchmark
Here I wanted to see if taking characters and putting them into a char[] array could be faster than calling Substring. My result was that Substring is faster. However, if you want to extract only certain characters, consider the char[] approach shown.
=== Data tested ===

string s = "onetwothree"; // Input

=== Char array method version ===

char[] c = new char[3];
c[0] = s[3];
c[1] = s[4];
c[2] = s[5];
string x = new string(c); // "two"
if (x == null)
{
}

=== Substring version ===

string x = s.Substring(3, 3); // "two"
if (x == null)
{
}

=== Substring benchmark result ===

Substring was faster. See figures at top.

Benchmark notes. This benchmark is based on .NET 3.5 SP1. The above code is simply a benchmark you can run in Visual Studio to see the performance difference of Substring and char[] arrays. It is best to use Substring when it has equivalent behavior. This site contains a useful benchmarking harness located in the "performance" section.

Summary
Here we saw several examples concentrated on the Substring instance method with one or two parameters on the string type in the C# programming language. Additionally, we saw where to research Substring on MSDN, Substring exceptions, information about Slice, and a benchmark of Substring. Substring is very useful and can help simplify your programs without significant performance problems. Combine it with IndexOf and Split for powerful string handling.
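
The slicing discussion in this article mentions an extension method without showing one. The following is a minimal sketch of what such a method might look like, assuming JavaScript-like semantics: the second argument is an end index (exclusive) rather than a length, and negative values count back from the end of the string. The name Slice and the clamping behavior are assumptions made for illustration; this is not the implementation from the String Slice article.

```csharp
using System;

static class StringSliceExtensions
{
    // Hypothetical extension: returns the characters from 'start' up to,
    // but not including, 'end'. Negative values count from the end.
    public static string Slice(this string source, int start, int end)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }
        if (start < 0) start = source.Length + start;
        if (end < 0) end = source.Length + end;

        // Clamp both positions into the valid range (an assumption of this sketch).
        start = Math.Max(0, Math.Min(start, source.Length));
        end = Math.Max(0, Math.Min(end, source.Length));

        if (end <= start)
        {
            return string.Empty;
        }
        // Translate (start, end) into Substring's (start, length).
        return source.Substring(start, end - start);
    }
}

class Program
{
    static void Main()
    {
        string input = "OneTwoThree";
        Console.WriteLine(input.Slice(3, 6));   // Two
        Console.WriteLine(input.Slice(0, -5));  // OneTwo
        Console.WriteLine(input.Slice(-5, 11)); // Three
    }
}
```

The only real work is converting an end index into the length that Substring expects, which is why the method ends in a single Substring call.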

C# Split String Examples

You want to split strings on different characters with single character or string delimiters. For example, split a string that contains "\r\n" sequences, which are Windows newlines. Through these examples, we learn ways to use the Split method on the string type in the C# programming language.

Use the Split method to separate parts from a string. If your input string is A,B,C, split on the comma to get an array of: "A" "B" "C"

Using Split
To begin, we look at the basic Split method overload. You already know the general way to do this, but it is good to see the basic syntax before we move on. This example splits on a single character.
=== Example program for splitting on spaces (C#) ===

using System;

class Program
{
    static void Main()
    {
        string s = "there is a cat";
        //
        // Split string on spaces.
        // ... This will separate all the words.
        //
        string[] words = s.Split(' ');
        foreach (string word in words)
        {
            Console.WriteLine(word);
        }
    }
}

=== Output of the program ===

there
is
a
cat

Description. The input string, which contains four words, is split on spaces and the foreach loop then displays each word. The result value from Split is a string[] array.

Multiple characters
Here we use either the Regex method or the C# new array syntax. Note that a new char array is created in the following usages. There is an overloaded method with that signature if you need StringSplitOptions, which is used to remove empty strings.
=== Program that splits on lines with Regex (C#) ===

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string value = "cat\r\ndog\r\nanimal\r\nperson";
        //
        // Split the string on line breaks.
        // ... The return value from Split is a string[] array.
        //
        string[] lines = Regex.Split(value, "\r\n");

        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}

=== Output of the program ===

cat
dog
animal
person
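
The Regex example splits only on the Windows "\r\n" sequence. If the input might mix Windows and Unix newlines, which is an assumption about your data rather than something this article tested, the pattern can make the carriage return optional. This is a minimal sketch; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class LineSplitter
{
    // Hypothetical helper: splits on "\r\n" or on a bare "\n".
    public static string[] SplitLines(string value)
    {
        // \r?\n matches an optional carriage return followed by a line feed.
        return Regex.Split(value, @"\r?\n");
    }
}

class Program
{
    static void Main()
    {
        // First break is a Windows newline, second is a Unix newline.
        string value = "cat\r\ndog\nanimal";
        foreach (string line in LineSplitter.SplitLines(value))
        {
            Console.WriteLine(line);
        }
    }
}
```

The verbatim string (the @ prefix) passes the backslashes through to the regex engine, which interprets \r and \n itself.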

StringSplitOptions
While the Regex type methods can be used to Split strings effectively, the string type Split method is faster in many cases. The Regex Split method is static; the string Split method is instance-based. The next example shows how you can specify an array as the first parameter to string Split.
=== Program that splits on multiple characters (C#) ===

using System;

class Program
{
    static void Main()
    {
        //
        // This string is also separated by Windows line breaks.
        //
        string value = "shirt\r\ndress\r\npants\r\njacket";

        //
        // Use a new char[] array of two characters (\r and \n) to break
        // lines into separate strings. Use "RemoveEmptyEntries"
        // to make sure no empty strings get put in the string[] array.
        //
        char[] delimiters = new char[] { '\r', '\n' };
        string[] parts = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < parts.Length; i++)
        {
            Console.WriteLine(parts[i]);

        }

        //
        // Same as the previous example, but uses a new string of 2 characters.
        //
        parts = value.Split(new string[] { "\r\n" }, StringSplitOptions.None);
        for (int i = 0; i < parts.Length; i++)
        {
            Console.WriteLine(parts[i]);
        }
    }
}

=== Output of the program ===
    (Repeated two times)

shirt
dress
pants
jacket

Overview. One useful overload of Split receives char[] arrays. The string Split method can receive a character array as the first parameter. Each char in the array designates a new block.

Using string arrays. Another overload of Split receives string[] arrays. This means a string array can also be passed to the Split method. The new string[] array is created inline with the Split call.

Explanation of StringSplitOptions. The RemoveEmptyEntries enum value is specified. When two delimiters are adjacent, we end up with an empty result. We can pass RemoveEmptyEntries as the second parameter to avoid this. See StringSplitOptions Enumeration.

Separating words
Here we see how you can separate words with Split. Usually, the best way to separate words is to use a Regex that specifies nonword chars. This example separates words in a string based on nonword characters. It eliminates punctuation and whitespace from the return array.
=== Program that separates on non-word pattern (C#) ===

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string[] w = SplitWords("That is a cute cat, man");
        foreach (string s in w)
        {
            Console.WriteLine(s);
        }

        Console.ReadLine();
    }

    /// <summary>
    /// Take all the words in the input string and separate them.
    /// </summary>
    static string[] SplitWords(string s)
    {
        //
        // Split on all non-word characters.
        // ... Returns an array of all the words.
        //
        return Regex.Split(s, @"\W+");
        // @      special verbatim string syntax
        // \W+    one or more non-word characters together
    }
}

=== Output of the program ===

That
is
a
cute
cat
man

Word splitting example. Here you can separate parts of your input string based on any character set or range with Regex. Overall, this provides more power than the string Split methods. See Regex.Split Method Examples.
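
One detail of the word-splitting approach: when the input begins or ends with a non-word character, Regex.Split includes an empty string at that end of the returned array. The sketch below filters those entries out. The method name SplitWordsNoEmpty is hypothetical and the filtering loop is one possible approach among several.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class WordSplitter
{
    // Hypothetical variant of SplitWords that drops empty entries.
    public static string[] SplitWordsNoEmpty(string s)
    {
        List<string> words = new List<string>();
        foreach (string part in Regex.Split(s, @"\W+"))
        {
            // A match at the start or end of the input yields an empty string.
            if (part.Length > 0)
            {
                words.Add(part);
            }
        }
        return words.ToArray();
    }
}

class Program
{
    static void Main()
    {
        // The trailing '!' would otherwise produce an empty last element.
        foreach (string word in WordSplitter.SplitWordsNoEmpty("That is a cute cat, man!"))
        {
            Console.WriteLine(word);
        }
    }
}
```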

Splitting text files
Here you have a text file containing comma-delimited lines of values. This is called a CSV file, and it is easily dealt with in the C# language. We use the File.ReadAllLines method here, but you may want StreamReader instead. This code reads in both of those lines, parses them, and displays the values of each line after the line number. The final comment shows how the file was parsed into the strings.
=== Contents of input file (TextFile1.txt) ===

Dog,Cat,Mouse,Fish,Cow,Horse,Hyena
Programmer,Wizard,CEO,Rancher,Clerk,Farmer

=== Program that splits lines in file (C#) ===

using System;
using System.IO;

class Program
{
    static void Main()
    {
        int i = 0;
        foreach (string line in File.ReadAllLines("TextFile1.txt"))
        {
            string[] parts = line.Split(',');
            foreach (string part in parts)
            {
                Console.WriteLine("{0}:{1}", i, part);

            }
            i++; // For demo only
        }
    }
}

=== Output of the program ===

0:Dog
0:Cat
0:Mouse
0:Fish
0:Cow
0:Horse
0:Hyena
1:Programmer
1:Wizard
1:CEO
1:Rancher
1:Clerk
1:Farmer

Splitting directory paths
Here we see how you can Split the segments in a Windows local directory into separate strings. Note that directory paths are complex and this may not handle all cases correctly. It is also platform-specific, and you could use System.IO.Path.DirectorySeparatorChar for more flexibility. See Path Examples.
=== Program that splits Windows directories (C#) ===

using System;

class Program
{
    static void Main()
    {
        // The directory from Windows
        const string dir = @"C:\Users\Sam\Documents\Perls\Main";
        // Split on directory separator
        string[] parts = dir.Split('\\');
        foreach (string part in parts)
        {
            Console.WriteLine(part);
        }
    }
}

=== Output of the program ===

C:
Users
Sam
Documents
Perls
Main

Internal logic
The logic internal to the .NET framework for Split is implemented in managed code. The methods call into the overload with three parameters. The parameters are next checked for validity. Finally, it uses unsafe code to create the separator list, and then a for loop combined with Substring to return the array.

Benchmarks

I tested a long string and a short string, having 1200 and 40 chars respectively. String splitting speed varies on the type of strings. The length of the blocks, number of delimiters, and total size of the string factor into performance. The Regex.Split option generally performed the worst. I felt that the second or third methods would be the best, after observing performance problems with regular expressions in other situations.
=== Strings used in test (C#) ===

//
// Build long string.
//
_test = string.Empty;
for (int i = 0; i < 120; i++)
{
    _test += "01234567\r\n";
}

//
// Build short string.
//
_test = string.Empty;
for (int i = 0; i < 10; i++)
{
    _test += "ab\r\n";
}

=== Example methods tested (100000 iterations) ===

static void Test1()
{
    string[] arr = Regex.Split(_test, "\r\n", RegexOptions.Compiled);
}

static void Test2()
{
    string[] arr = _test.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
}

static void Test3()
{
    string[] arr = _test.Split(new string[] { "\r\n" }, StringSplitOptions.None);
}

Longer strings: 1200 chars. The benchmark for the methods on the long strings is more even. It may be that for very long strings, such as entire files, the Regex method is equivalent or even faster. For short strings, Regex is slowest by a wide margin (about seven times slower than the char[] Split); for long strings the gap narrows to under three times.

=== Benchmark of Split on long strings ===

[1] Regex.Split:    3470 ms
[2] char[] Split:   1255 ms [fastest]
[3] string[] Split: 1449 ms

Short strings: 40 chars. This shows the three methods compared to each other on short strings. Method 1 is the Regex method, and it is by far the slowest on the short strings. This may be because of the compilation time.

=== Benchmark of Split on short strings ===

[1] Regex.Split:    434 ms
[2] char[] Split:    63 ms [fastest]
[3] string[] Split:  83 ms

Smaller is better. This article was last updated for .NET 3.5 SP1.
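
The timing code behind these figures is not shown in the article. A minimal sketch of how such a measurement could be reproduced with System.Diagnostics.Stopwatch follows. The SplitTimer class, its Time method, and the warm-up call are assumptions made for illustration, not the harness the author used, and results will differ by machine and runtime version.

```csharp
using System;
using System.Diagnostics;

static class SplitTimer
{
    // Hypothetical helper: runs 'action' once as a warm-up, then
    // 'iterations' more times, and returns the elapsed milliseconds
    // for the timed runs only.
    public static long Time(Action action, int iterations)
    {
        action(); // warm-up so JIT compilation is not part of the timing
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}

class Program
{
    static string _test = string.Empty;

    static void Main()
    {
        // Build the short string from the benchmark (40 chars).
        for (int i = 0; i < 10; i++)
        {
            _test += "ab\r\n";
        }

        char[] delimiters = new char[] { '\r', '\n' };
        long ms = SplitTimer.Time(delegate
        {
            _test.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        }, 100000);

        Console.WriteLine("char[] Split: {0} ms", ms);
    }
}
```

The same Time call can wrap the Regex.Split and string[] Split versions to compare all three methods on your own hardware.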

Performance recommendation. For programs that use shorter strings, the methods that split based on arrays are faster and simpler, and they will avoid Regex compilation. For somewhat longer strings or files that contain more lines, Regex is appropriate. Also, I show some Split improvements that can improve your program. See Split String Improvement.

Escaped characters
Here we note that you can use Replace on your string input to substitute special characters in for any escaped characters. This can solve lots of problems on parsing computer-generated code or data. See Split Method and Escape Characters.

Delimiter arrays
In this section, we focus on how you can specify delimiters to the Split method in the C# language. My further research into Split and its performance shows that it is worthwhile to declare your char[] array you are splitting on as a local instance to reduce memory pressure and improve runtime performance. There is another example of delimiter array allocation on this site. See Split Delimiter Use.
=== Slow version, before (C#) ===

//
// Split on multiple characters using new char[] inline.
//
string t = "string to split, ok";

for (int i = 0; i < 10000000; i++)
{
    string[] s = t.Split(new char[] { ' ', ',' });
}

=== Fast version, after (C#) ===

//
// Split on multiple characters using new char[] already created.
//
string t = "string to split, ok";
char[] c = new char[] { ' ', ',' }; // <-- Cache this

for (int i = 0; i < 10000000; i++)
{
    string[] s = t.Split(c);
}

Interpretation. We see that storing the array of delimiters separately is good. My measurements show the above code is less than 10% faster when the array is stored outside the loop.

Explode
In this part, we discuss the explode function from the PHP environment. The .NET Framework has no explode method exactly like PHP explode, but you can gain the functionality quite easily with Split, for the most part. You can replace explode with the Split method that receives a string[] array. Explode splits a string on a delimiter and can optionally limit the number of pieces returned. The new article on this topic implements the logic in the C# language directly. See Explode String Extension Method.

Summary
In this tutorial, we saw several examples and two benchmarks of the Split method in the C# programming language. You can use Split to divide or separate your strings while keeping your code as simple as possible. Sometimes, using IndexOf and Substring together to parse your strings can be more precise and less error-prone.
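
As an illustration of that last point, consider a "key=value" line whose value may itself contain the delimiter. Split on '=' would break the value into pieces, while IndexOf finds only the first delimiter and Substring takes everything on each side of it. This is a minimal sketch; the KeyValueParser class and the "key=value" input format are assumptions made for the example.

```csharp
using System;

static class KeyValueParser
{
    // Hypothetical helper: separates a line at the FIRST '=' only,
    // so any later '=' characters stay inside the value.
    public static bool TryParse(string line, out string key, out string value)
    {
        key = null;
        value = null;
        if (line == null)
        {
            return false;
        }
        int index = line.IndexOf('=');
        if (index < 0)
        {
            return false; // no delimiter present
        }
        key = line.Substring(0, index);    // text before the '='
        value = line.Substring(index + 1); // text after the '='
        return true;
    }
}

class Program
{
    static void Main()
    {
        string key;
        string value;
        if (KeyValueParser.TryParse("formula=a=b+c", out key, out value))
        {
            Console.WriteLine("{0} -> {1}", key, value); // formula -> a=b+c
        }
    }
}
```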
