Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
Data.Text.Lazy.Manipulate
Description
Manipulate identifiers and structurally non-complex pieces of text by delimiting word boundaries via a combination of whitespace, control-characters, and case-sensitivity.
Assumptions have been made about word boundary characteristics inherent in predominantly English text. Please see the individual function documentation for further details and behaviour.
Synopsis
- takeWord :: Text -> Text
- dropWord :: Text -> Text
- stripWord :: Text -> Maybe Text
- breakWord :: Text -> (Text, Text)
- splitWords :: Text -> [Text]
- lowerHead :: Text -> Text
- upperHead :: Text -> Text
- mapHead :: (Char -> Char) -> Text -> Text
- indentLines :: Int -> Text -> Text
- prependLines :: Text -> Text -> Text
- toEllipsis :: Int64 -> Text -> Text
- toEllipsisWith :: Int64 -> Text -> Text -> Text
- toAcronym :: Text -> Maybe Text
- toOrdinal :: Integral a => a -> Text
- toTitle :: Text -> Text
- toCamel :: Text -> Text
- toPascal :: Text -> Text
- toSnake :: Text -> Text
- toSpinal :: Text -> Text
- toTrain :: Text -> Text
- isBoundary :: Char -> Bool
- isWordBoundary :: Char -> Bool
Strict vs lazy types
This library provides functions for manipulating both strict and lazy Text types. The strict functions are provided by the Data.Text.Manipulate module, while the lazy functions are provided by the Data.Text.Lazy.Manipulate module.
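When a caller holds strict Text but wants to reuse a lazy-Text function (or the reverse), the conversion functions from the text package can bridge the two. The following is a minimal sketch; `viaLazy` and `shout` are hypothetical helper names that are not part of this library, and `TL.toUpper` merely stands in for any lazy-Text transformation.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- Hypothetical helper (not part of this library): run a lazy-Text
-- transformation on a strict Text value by converting at the edges.
viaLazy :: (TL.Text -> TL.Text) -> T.Text -> T.Text
viaLazy f = TL.toStrict . f . TL.fromStrict

-- Example use: TL.toUpper stands in for any lazy-Text function.
shout :: T.Text -> T.Text
shout = viaLazy TL.toUpper
```

In practice it is usually simpler to import whichever module matches the Text type already in use; a wrapper like this is mainly useful at API boundaries.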
Unicode
While this library supports Unicode in a similar fashion to the underlying text library, more explicit, Unicode-specific handling of word boundaries can be found in the text-icu library.
Fusion
Many functions in this module are subject to fusion, meaning that a pipeline of such functions will usually allocate at most one Text value.
Functions that can be fused by the compiler are documented with the phrase Subject to fusion.
Subwords
Removing words
takeWord :: Text -> Text
O(n) Returns the first word, or the original text if no word boundary is encountered. Subject to fusion.
dropWord :: Text -> Text
O(n) Return the suffix after dropping the first word. If no word boundary is encountered, the result will be empty. Subject to fusion.
stripWord :: Text -> Maybe Text
O(n) Return the suffix after removing the first word, or Nothing if no word boundary is encountered.
>>> stripWord "HTML5Spaghetti"
Just "Spaghetti"
>>> stripWord "noboundaries"
Nothing
Breaking on words
breakWord :: Text -> (Text, Text)
Break a piece of text after the first word boundary is encountered.
>>> breakWord "PascalCasedVariable"
("Pascal", "CasedVariable")
>>> breakWord "spinal-cased-variable"
("spinal", "cased-variable")
splitWords :: Text -> [Text]
O(n) Split into a list of words delimited by boundaries.
>>> splitWords "SupercaliFrag_ilistic"
["Supercali","Frag","ilistic"]
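To make the splitting behaviour concrete, here is a rough sketch of one way to split on word boundaries. It is not the library's implementation. It assumes that a boundary is any non-alphanumeric character (which is dropped) and that a new word also starts at an uppercase character following a non-uppercase one; the names `isBoundarySketch` and `splitWordsSketch` are hypothetical.

```haskell
import Data.Char (isAlphaNum, isUpper)
import qualified Data.Text.Lazy as TL

-- Assumption: a boundary is any non-alphanumeric character.
isBoundarySketch :: Char -> Bool
isBoundarySketch = not . isAlphaNum

-- Split on boundary characters (dropping them) and before an uppercase
-- character that directly follows a non-uppercase character.
splitWordsSketch :: TL.Text -> [TL.Text]
splitWordsSketch =
  map TL.pack . filter (not . null) . go [] Nothing . TL.unpack
  where
    -- acc holds the current word in reverse; prev is the previous
    -- character of that word, if any.
    go acc _ [] = [reverse acc]
    go acc prev (c : cs)
      | isBoundarySketch c =
          reverse acc : go [] Nothing cs
      | isUpper c && maybe False (not . isUpper) prev =
          reverse acc : go [c] (Just c) cs
      | otherwise =
          go (c : acc) (Just c) cs
```

Under these assumptions, `splitWordsSketch (TL.pack "SupercaliFrag_ilistic")` yields the same three words as the example above, and a run of capitals such as the one in "HTML5Spaghetti" stays together as a single word.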
Character manipulation
lowerHead :: Text -> Text
Lowercase the first character of a piece of text.
>>> lowerHead "Title Cased"
"title Cased"
upperHead :: Text -> Text
Uppercase the first character of a piece of text.
>>> upperHead "snake_cased"
"Snake_cased"
mapHead :: (Char -> Char) -> Text -> Text
Apply a function to the first character of a piece of text.
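As an illustration of how the head-mapping functions relate to each other, the sketch below rebuilds them on top of `uncons` and `cons` from Data.Text.Lazy. The `...Sketch` names are hypothetical and the code is not taken from the library; it only mirrors the documented behaviour.

```haskell
import Data.Char (toLower, toUpper)
import qualified Data.Text.Lazy as TL

-- Apply a function to the first character, leaving empty text untouched.
mapHeadSketch :: (Char -> Char) -> TL.Text -> TL.Text
mapHeadSketch f t = case TL.uncons t of
  Just (c, rest) -> TL.cons (f c) rest
  Nothing        -> t

-- The two casing helpers are then specialisations of mapHeadSketch.
lowerHeadSketch, upperHeadSketch :: TL.Text -> TL.Text
lowerHeadSketch = mapHeadSketch toLower
upperHeadSketch = mapHeadSketch toUpper
```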
Line manipulation
indentLines :: Int -> Text -> Text
Indent newlines by the given number of spaces.
prependLines :: Text -> Text -> Text
Prepend newlines with the given separator.
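The relationship between the two line functions can be sketched as follows. This is a guess at the behaviour rather than the library's code: it assumes the separator is prepended to every line, including the first, and that a trailing newline is not preserved. The `...Sketch` names are hypothetical.

```haskell
import qualified Data.Text.Lazy as TL

-- Assumption: the separator goes in front of every line, first included.
prependLinesSketch :: TL.Text -> TL.Text -> TL.Text
prependLinesSketch sep =
  TL.intercalate (TL.pack "\n") . map (sep <>) . TL.lines

-- Indentation is then prepending a run of n spaces.
indentLinesSketch :: Int -> TL.Text -> TL.Text
indentLinesSketch n =
  prependLinesSketch (TL.replicate (fromIntegral n) (TL.pack " "))
```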
Ellipsis
toEllipsis :: Int64 -> Text -> Text
O(n) Truncate text to a specific length. If the text was truncated, the ellipsis sign "..." will be appended.
See: toEllipsisWith
toEllipsisWith
  :: Int64  -- Length.
  -> Text   -- Ellipsis.
  -> Text
  -> Text
O(n) Truncate text to a specific length. If the text was truncated, the given ellipsis sign will be appended.
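A minimal sketch of the truncation logic, assuming the length argument counts only the characters kept from the original text and does not include the ellipsis itself (the documentation above does not say which). The `...Sketch` names are hypothetical.

```haskell
import Data.Int (Int64)
import qualified Data.Text.Lazy as TL

-- Truncate to n characters and append the suffix only when something
-- was actually cut off.
toEllipsisWithSketch :: Int64 -> TL.Text -> TL.Text -> TL.Text
toEllipsisWithSketch n suffix t
  | TL.length t > n = TL.take n t <> suffix
  | otherwise       = t

-- The plain variant fixes the suffix to three dots.
toEllipsisSketch :: Int64 -> TL.Text -> TL.Text
toEllipsisSketch n = toEllipsisWithSketch n (TL.pack "...")
```

Note the argument order: the ellipsis comes before the text, which makes partial application such as `toEllipsisWithSketch 80 suffix` convenient when mapping over many values.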
Acronyms
toAcronym :: Text -> Maybe Text
O(n) Create an ad hoc acronym from a piece of cased text.
>>> toAcronym "AmazonWebServices"
Just "AWS"
>>> toAcronym "Learn-You Some_Haskell"
Just "LYSH"
>>> toAcronym "this_is_all_lowercase"
Nothing
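The examples above are consistent with simply collecting the uppercase characters. The sketch below does that and returns Nothing when no uppercase character is found; the exact threshold the library uses is not stated here, so treat that cut-off as an assumption. `toAcronymSketch` is a hypothetical name.

```haskell
import Data.Char (isUpper)
import qualified Data.Text.Lazy as TL

-- Keep only uppercase characters; an empty result means no acronym.
toAcronymSketch :: TL.Text -> Maybe TL.Text
toAcronymSketch t
  | TL.null caps = Nothing
  | otherwise    = Just caps
  where
    caps = TL.filter isUpper t
```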
Ordinals
toOrdinal :: Integral a => a -> Text
Render an ordinal used to denote the position in an ordered sequence.
>>> toOrdinal (101 :: Int)
"101st"
>>> toOrdinal (12 :: Int)
"12th"
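The usual English suffix rules that produce these results can be sketched as below: numbers ending in 11, 12, or 13 take "th", and otherwise the last digit decides. This is an illustrative reimplementation under those assumptions, with the hypothetical name `toOrdinalSketch`, not the library's source.

```haskell
import qualified Data.Text.Lazy as TL

-- Render a number followed by its English ordinal suffix.
toOrdinalSketch :: Integral a => a -> TL.Text
toOrdinalSketch n = TL.pack (show i ++ suffix)
  where
    i = toInteger n
    -- Work on the absolute value so negative inputs get a sensible suffix.
    lastTwo = abs i `mod` 100
    suffix
      | lastTwo >= 11 && lastTwo <= 13 = "th"
      | otherwise = case lastTwo `mod` 10 of
          1 -> "st"
          2 -> "nd"
          3 -> "rd"
          _ -> "th"
```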
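Casing
toTitle :: Text -> Text
toCamel :: Text -> Text
toPascal :: Text -> Text
toSnake :: Text -> Text
toSpinal :: Text -> Text
toTrain :: Text -> Text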
Boundary predicates
isBoundary :: Char -> Bool
Returns True for any boundary character.
isWordBoundary :: Char -> Bool
Returns True for any boundary or uppercase character.