Split Method
The “split()” method splits a string into words using a user-specified separator. It returns a list of the resulting substrings, without the separator itself. If no separator is specified, any run of one or more whitespace characters is treated as a single separator.
For instance, the code below will return “[‘Linux’, ‘Hint’]” as output:
text = "Linux Hint"
text.split()
The code below will return “[‘LinuxHint’, ‘com’]” as output when “.” is used as separator:
text = "LinuxHint.com"
text.split(".")
The separator doesn’t have to be a single character. The split method takes two arguments:
- sep: separator to be used for splitting
- maxsplit: number of splits to do
Both these arguments are optional. As mentioned above, if the “sep” argument is not specified, whitespace is used as a separator for splitting. The “maxsplit” argument has a default value of “-1” and it splits all occurrences by default. Consider the code below:
text = "LinuxHint.co.us"
text.split(".")
It will return “[‘LinuxHint’, ‘co’, ‘us’]” as output. If you want to stop splitting at the first occurrence of the separator, specify “1” as the “maxsplit” argument.
text.split(".", 1)
The code above will return “[‘LinuxHint’, ‘co.us’]” as output. The “maxsplit” value is the maximum number of splits to perform, so the resulting list contains at most “maxsplit + 1” items.
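The two arguments can be tried together in a short sketch. The sample strings below are made up for illustration and do not come from the examples above.

```python
# Sample strings chosen only for illustration.
record = "alpha::beta::gamma::delta"

# A separator may be longer than one character.
all_parts = record.split("::")

# With maxsplit set to 2, at most two splits are performed,
# so the rest of the string stays together in the last item.
first_two = record.split("::", 2)

# maxsplit can be passed by keyword while keeping the default
# whitespace separator.
sentence = "one two three four"
head_and_rest = sentence.split(maxsplit=1)

print(all_parts)      # ['alpha', 'beta', 'gamma', 'delta']
print(first_two)      # ['alpha', 'beta', 'gamma::delta']
print(head_and_rest)  # ['one', 'two three four']
```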
Note that if the string contains consecutive separators and the separator is passed explicitly, an empty string is returned for the gap between each pair of adjacent separators (assuming the “maxsplit” argument does not stop the splitting earlier):
text = "LinuxHint..com"
text.split(".")
The code above will return “[‘LinuxHint’, ‘’, ‘com’]” as output. If you want to remove empty strings from the resulting list, you can use the following list comprehension statement:
text = "LinuxHint..com"
result = text.split(".")
result = [item for item in result if item != ""]
print(result)
You will get “[‘LinuxHint’, ‘com’]” as the output after running the above code sample.
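Empty strings only show up when a separator is passed explicitly. The sketch below contrasts the default whitespace behavior with an explicit single-space separator; the sample string is an assumption made for illustration.

```python
# Illustrative string with three spaces between the two words.
spaced = "Linux   Hint"

# No separator given: a run of whitespace counts as one separator.
default_split = spaced.split()

# Explicit single-space separator: each gap between adjacent spaces
# yields an empty string.
explicit_split = spaced.split(" ")

print(default_split)   # ['Linux', 'Hint']
print(explicit_split)  # ['Linux', '', '', 'Hint']
```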
Note that the “split()” method scans the string from left to right. If you want to split a string from right to left, use “rsplit()” instead. Its syntax and arguments are the same as those of the “split()” method; the results differ only when “maxsplit” limits the number of splits.
If no separator is found in the string while using “split()” or “rsplit()” methods, the original string is returned as the sole list element.
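The two notes above can be checked with a short sketch. It reuses the “LinuxHint.co.us” string from the earlier examples; the variable names are chosen here for illustration.

```python
text = "LinuxHint.co.us"

# With maxsplit=1, split() cuts at the leftmost separator
# and rsplit() cuts at the rightmost one.
left_first = text.split(".", 1)
right_first = text.rsplit(".", 1)

# Without maxsplit, both methods return the same list.
same_result = text.split(".") == text.rsplit(".")

# A separator that never occurs leaves the string in one piece.
no_match = text.split(",")

print(left_first)   # ['LinuxHint', 'co.us']
print(right_first)  # ['LinuxHint.co', 'us']
print(same_result)  # True
print(no_match)     # ['LinuxHint.co.us']
```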
Partition Method
The “partition()” method also splits strings, but it differs from the “split()” method in a few ways. The most notable difference is that it retains the separator: it returns a tuple of three items, namely the part before the separator, the separator itself, and the part after it. This is especially useful if you want to divide the string into an iterable object (a tuple in this case) without removing any original characters. Consider the code below:
text = "LinuxHint.com"
result = text.partition(".")
print(result)
The above code sample will return “(‘LinuxHint’, ‘.’, ‘com’)” as the output. If you want the result to be of list type, use the following code sample instead:
text = "LinuxHint.com"
result = list(text.partition("."))
print(result)
You should get “[‘LinuxHint’, ‘.’, ‘com’]” as output after running the above code sample.
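Because the result always has three items, it can also be unpacked directly into three names. The sketch below assumes a made-up “key=value” string; the variable names are illustrative only.

```python
# Hypothetical key=value string used only for illustration.
setting = "editor=vim"

# partition() always returns three items, so unpacking is safe.
key, sep, value = setting.partition("=")

# Joining the three parts restores the original string,
# since no characters are dropped.
rebuilt = key + sep + value

print(key)      # editor
print(sep)      # =
print(value)    # vim
print(rebuilt)  # editor=vim
```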
The “partition()” method takes only one argument called “sep”. The separator can be of any length, but it cannot be an empty string. Unlike the “split()” method, this argument is mandatory, so you can’t omit the separator. However, you can specify whitespace as a separator.
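A small sketch of these rules follows. The sample string and variable names are assumptions made for illustration; the error types are the ones raised by Python's built-in “str.partition()”.

```python
# Illustrative string; not taken from the examples above.
text = "Linux Hint Website"

# Whitespace works as an ordinary separator, and only the
# first occurrence is used.
parts = text.partition(" ")

# Omitting the separator is an error because the argument is mandatory.
try:
    text.partition()
except TypeError as error:
    missing_sep_error = type(error).__name__

# An empty separator is rejected as well.
try:
    text.partition("")
except ValueError as error:
    empty_sep_error = type(error).__name__

print(parts)              # ('Linux', ' ', 'Hint Website')
print(missing_sep_error)  # TypeError
print(empty_sep_error)    # ValueError
```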
Note that the partition method stops at the first occurrence of the separator. So if your string contains multiple separators, the “partition()” method will ignore all other occurrences. Here is an example illustrating this:
text = "LinuxHint.co.us"
result = list(text.partition("."))
print(result)
The code sample will produce “[‘LinuxHint’, ‘.’, ‘co.us’]” as output. If you want to split at all occurrences of the separator and include the separator in the final list as well, you may have to use a “Regular Expression” or “RegEx” pattern. For the example mentioned above, you can use a RegEx pattern in the following way:
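import re

text = "LinuxHint.co.us"
# The raw string keeps the backslash intact; the parentheses make re.split keep the separator.
result = re.split(r"(\.)", text)
print(result)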
You will get “[‘LinuxHint’, ‘.’, ‘co’, ‘.’, ‘us’]” as output after executing the above code sample. The dot character has been escaped in the RegEx statement mentioned above. Note that while the example above works with a single dot character, it may not work with complex separators and complex strings. You may have to define your own RegEx pattern depending on your use case. The example is just mentioned here to give you some idea about the process of retaining the separator in the final list using RegEx statements.
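For separators that are plain text rather than patterns, one possible approach is to build the pattern with “re.escape()”. The helper below, “split_keep_separator”, is a hypothetical name introduced here for illustration. It is a minimal sketch that assumes a non-empty literal separator, not a general solution for complex cases.

```python
import re


def split_keep_separator(text, sep):
    """Split text at every literal occurrence of sep, keeping sep in the result.

    Hypothetical helper for illustration; assumes sep is a non-empty string.
    """
    # re.escape makes characters such as "." or "|" match literally.
    # The capturing group tells re.split to include the separator in the output.
    pattern = "(" + re.escape(sep) + ")"
    return re.split(pattern, text)


print(split_keep_separator("LinuxHint.co.us", "."))
# ['LinuxHint', '.', 'co', '.', 'us']
print(split_keep_separator("a->b->c", "->"))
# ['a', '->', 'b', '->', 'c']
```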
The “partition()” method can leave empty strings in the result, most notably when the separator is not found in the string: in that case it returns the original string followed by two empty strings. You can use a list comprehension statement to remove the empty strings, as explained in the “split()” method section above.
text = "LinuxHint"
result = list(text.partition("."))
result = [item for item in result if item != ""]
print(result)
After running the above code, you should get “[‘LinuxHint’]” as output.
Conclusion
For simple and straightforward splits, you can use “split()” and “partition()” methods to get iterable types. For complex strings and separators, you will need to use RegEx statements.