This section contains reference documentation for the regexpReplace function
Find and replace a string or regexp pattern with a target string or regexp pattern. If matchStr is not found, inputStr will be returned. By default, all occurrences of match pattern in the input string will be replaced. Default matching mode is case sensitive.
regexpReplace(inputStr, matchRegexp, replaceRegexp)
regexpReplace(inputStr, matchRegexp, replaceRegexp, matchStartPos)
regexpReplace(inputStr, matchRegexp, replaceRegexp, matchStartPos, occurrence)
regexpReplace(inputStr, matchRegexp, replaceRegexp, matchStartPos, occurrence, flag)
inputStrThe input string or the column name on which regexpReplace function should be applied.
matchRegexpThe regular expression or string used to match against the input string or column value.
replaceRegexpThe regular expression or string to replace if a match is found.
matchStartPosIndex of inputStr from where matching should start. Counting starts and 0. Default value is 0 if not specified.
occurrenceControls which occurence of the matched pattern must be replaced. Counting starts at 0. Default value is -1 if not specified
flagSingle character flag that controls how the regex finds matches in inputStr. If an incorrect flag is specified, the function applies default case sensitive match. Only one flag can be specified. Supported flags are:
i -> case insensitive match
In the example below, shows a simple string find and replace example where all occurrences of the matched string o is replaced with string x.
select regexpReplace('foo', 'o', 'x') AS value
from myTablefxx
The example below shows how a regexp pattern containing consecutive digits is found and replaced with a simple string bar.
select regexpReplace('foo123', '[0-9]+', 'bar') AS value
from myTablefoobar
The example below shows how a regexp pattern containing consecutive non-digits is found and replaced with a simple string bar.
select regexpReplace('foo123', '[^0-9]+', 'bar') AS value
from myTablebar123
The following example demonstrates how replaceStr can contain backreferences to substrings captured by the matchStr regular expression. Backreferences are indicated by $n where n can range from 0-9. In the example below, every character in the input is replaced by the character appended with a space.
select regexpReplace('foo', '(.)', '$1 ') AS value
from myTablef o o
This example shows how regexpReplace can be used to remove extra whitespaces between words in an input string.
select regexpReplace('Pinot is blazing fast', '( ){2,}', ' ') AS value
from myTablePinot is blazing fast
This example shows the power of backreferencing can be used in regexpReplace to format phone numbers.
select regexpReplace('11234567898','(\\d)(\\d{3})(\\d{3})(\\d{4})', '$1-($2) $3-$4') AS value
from myTable1-(123) 456-7898
This example shows how the matchStartPos parameter can be used. Since the matchStartPos is set to 4, pattern matching against the inputStr begins at index 4 there by leading to the string healthy not being replaced.
select regexpReplace('healthy, wealthy, stealthy and wise','\\w+thy', 'something', 4) AS value
from myTablehealthy, something, something and wise
This example shows how the occurence parameter can be used. In the example below, the matchStr regular expression matches against three instances in the input - healthy, wealthy and stealthy. As the occurence is specified to 2, the second occurence (counting from zero) stealthy is replaced with something
select regexpReplace('healthy, wealthy, stealthy and wise','\\w+thy', 'something', 0, 2) AS value
from myTablehealthy, wealthy, something and wise
The example below shows the usage of the flag parameter. Here the case insensitive flag i is specified.
select regexpReplace('healthy, wealthy, stealthy and wise','\\w+THY', 'something', 0, 0, 'i') AS value
from myTablesomething, wealthy, stealthy and wise
The examples below show some sample queries using regexpReplace in there WHERE clause of a query.
SELECT col1, col2
FROM myTable
WHERE regexpReplace(stateCode, '[VC]A', 'TEST') = 'TEST'SELECT count(*)
FROM myTable
WHERE contains(regexpReplace(stateCode, '(C)(A)', '$1TEST$2'), 'CTESTA')