<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Roland's Blog (Posts about regex)</title><link>https://blog.rweisleder.de/</link><description></description><atom:link href="https://blog.rweisleder.de/categories/regex.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 &lt;a href="mailto:roland@rweisleder.de"&gt;Roland Weisleder&lt;/a&gt; </copyright><lastBuildDate>Sun, 26 Apr 2026 10:22:29 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Java Regex: Dynamic Replacements with Lambda Expressions</title><link>https://blog.rweisleder.de/posts/java-regex-dynamic-replacement/</link><dc:creator>Roland Weisleder</dc:creator><description>&lt;div&gt;&lt;p&gt;Regular expressions are a powerful tool to find patterns in strings.
Static replacements are also relatively easy to implement.
When it comes to dynamic replacements, things get more interesting.&lt;/p&gt;
&lt;section id="static-replacements"&gt;
&lt;h2&gt;Static replacements&lt;/h2&gt;
&lt;p&gt;Suppose we want to convert all ISO dates in a string into the European format DD.MM.YYYY for our users.&lt;/p&gt;
&lt;pre class="code java"&gt;&lt;a id="rest_code_76306180af9e47e1818fb0a8424b3f3f-1" name="rest_code_76306180af9e47e1818fb0a8424b3f3f-1"&gt;&lt;/a&gt;&lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Lorem ipsum 2023-11-07 dolor sit 2021-09-14 amet."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;a id="rest_code_76306180af9e47e1818fb0a8424b3f3f-2" name="rest_code_76306180af9e47e1818fb0a8424b3f3f-2"&gt;&lt;/a&gt;&lt;span class="n"&gt;Pattern&lt;/span&gt; &lt;span class="n"&gt;isoDatePattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Pattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"(\\d{4})-(\\d{2})-(\\d{2})"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_76306180af9e47e1818fb0a8424b3f3f-3" name="rest_code_76306180af9e47e1818fb0a8424b3f3f-3"&gt;&lt;/a&gt;&lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;isoDatePattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="na"&gt;replaceAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$3.$2.$1"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_76306180af9e47e1818fb0a8424b3f3f-4" name="rest_code_76306180af9e47e1818fb0a8424b3f3f-4"&gt;&lt;/a&gt;&lt;span class="c1"&gt;// "Lorem ipsum 07.11.2023 dolor sit 14.09.2021 amet."&lt;/span&gt;
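&lt;/pre&gt;&lt;p&gt;For a static replacement, we can reference each capturing group of a match in the replacement string using the dollar notation (&lt;code class="docutils literal"&gt;$1&lt;/code&gt;, &lt;code class="docutils literal"&gt;$2&lt;/code&gt;, and so on).&lt;/p&gt;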
&lt;p&gt;But suppose we now want to show our users the dates in a long form, depending on their locale.
The result should look something like this:&lt;/p&gt;
&lt;pre class="code java"&gt;&lt;a id="rest_code_febf431224d64f4888975ac4fcf96e38-1" name="rest_code_febf431224d64f4888975ac4fcf96e38-1"&gt;&lt;/a&gt;&lt;span class="n"&gt;Locale&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Locale&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;US&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;a id="rest_code_febf431224d64f4888975ac4fcf96e38-2" name="rest_code_febf431224d64f4888975ac4fcf96e38-2"&gt;&lt;/a&gt;&lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Lorem ipsum 2023-11-07 dolor sit 2021-09-14 amet."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;a id="rest_code_febf431224d64f4888975ac4fcf96e38-3" name="rest_code_febf431224d64f4888975ac4fcf96e38-3"&gt;&lt;/a&gt;&lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;a id="rest_code_febf431224d64f4888975ac4fcf96e38-4" name="rest_code_febf431224d64f4888975ac4fcf96e38-4"&gt;&lt;/a&gt;&lt;span class="c1"&gt;// "Lorem ipsum November 7, 2023 dolor sit September 14, 2021 amet."&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;In this case, we cannot simply transform the groups.
Instead, we have to dynamically execute additional code for each match.
Depending on the Java version we are using, we have various options.&lt;/p&gt;
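&lt;p&gt;As a minimal sketch of one such option, assuming Java 9 or later: &lt;code class="docutils literal"&gt;Matcher.replaceAll&lt;/code&gt; also accepts a function that is called once per match and returns the replacement text. The class name &lt;code class="docutils literal"&gt;DynamicReplacement&lt;/code&gt; and the helper &lt;code class="docutils literal"&gt;localizeDates&lt;/code&gt; are made up for this illustration, and this is not necessarily the approach the full post settles on.&lt;/p&gt;
&lt;pre class="code java"&gt;
```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DynamicReplacement {

    // Same pattern as in the static example above.
    private static final Pattern ISO_DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Hypothetical helper: rewrites every ISO date in the input to the long date form of the given locale.
    static String localizeDates(String input, Locale locale) {
        DateTimeFormatter formatter =
                DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).withLocale(locale);
        // replaceAll(Function) is available since Java 9; the lambda runs once for each match.
        return ISO_DATE.matcher(input).replaceAll(match -> {
            // match.group() is the whole matched text, e.g. "2023-11-07".
            LocalDate date = LocalDate.parse(match.group());
            // quoteReplacement keeps '$' and '\' in the formatted text from being read as group references.
            return Matcher.quoteReplacement(date.format(formatter));
        });
    }

    public static void main(String[] args) {
        String input = "Lorem ipsum 2023-11-07 dolor sit 2021-09-14 amet.";
        System.out.println(localizeDates(input, Locale.US));
    }
}
```
&lt;/pre&gt;
&lt;p&gt;Note that the pattern also matches digit sequences that are not real dates, such as 2023-13-45. In this sketch &lt;code class="docutils literal"&gt;LocalDate.parse&lt;/code&gt; would then throw a &lt;code class="docutils literal"&gt;DateTimeParseException&lt;/code&gt;, so real code would have to decide how to handle such input.&lt;/p&gt;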
&lt;p&gt;&lt;a href="https://blog.rweisleder.de/posts/java-regex-dynamic-replacement/"&gt;Read more…&lt;/a&gt; (3 min remaining to read)&lt;/p&gt;&lt;/section&gt;&lt;/div&gt;</description><category>java</category><category>regex</category><guid>https://blog.rweisleder.de/posts/java-regex-dynamic-replacement/</guid><pubDate>Thu, 03 Jul 2025 14:15:00 GMT</pubDate></item><item><title>One Step Towards Maintainable Regular Expressions In Java</title><link>https://blog.rweisleder.de/posts/java-regex-named-groups/</link><dc:creator>Roland Weisleder</dc:creator><description>&lt;div&gt;&lt;p&gt;Regular expressions are a great way to extract data from strings.
In Java, we typically use &lt;code class="docutils literal"&gt;Pattern&lt;/code&gt; and &lt;code class="docutils literal"&gt;Matcher&lt;/code&gt;:&lt;/p&gt;
&lt;pre class="code java"&gt;&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-1" name="rest_code_7e644b994f0c4147a53fafda335afd31-1"&gt;&lt;/a&gt;&lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;parseDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-2" name="rest_code_7e644b994f0c4147a53fafda335afd31-2"&gt;&lt;/a&gt;    &lt;span class="n"&gt;Pattern&lt;/span&gt; &lt;span class="n"&gt;datePattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Pattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"(\\d{4})-(\\d{2})-(\\d{2})"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-3" name="rest_code_7e644b994f0c4147a53fafda335afd31-3"&gt;&lt;/a&gt;    &lt;span class="n"&gt;Matcher&lt;/span&gt; &lt;span class="n"&gt;dateMatcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datePattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-4" name="rest_code_7e644b994f0c4147a53fafda335afd31-4"&gt;&lt;/a&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;dateMatcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-5" name="rest_code_7e644b994f0c4147a53fafda335afd31-5"&gt;&lt;/a&gt;        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;IllegalArgumentException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Invalid date format"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-6" name="rest_code_7e644b994f0c4147a53fafda335afd31-6"&gt;&lt;/a&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-7" name="rest_code_7e644b994f0c4147a53fafda335afd31-7"&gt;&lt;/a&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-8" name="rest_code_7e644b994f0c4147a53fafda335afd31-8"&gt;&lt;/a&gt;    &lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dateMatcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-9" name="rest_code_7e644b994f0c4147a53fafda335afd31-9"&gt;&lt;/a&gt;    &lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dateMatcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-10" name="rest_code_7e644b994f0c4147a53fafda335afd31-10"&gt;&lt;/a&gt;    &lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dateMatcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-11" name="rest_code_7e644b994f0c4147a53fafda335afd31-11"&gt;&lt;/a&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-12" name="rest_code_7e644b994f0c4147a53fafda335afd31-12"&gt;&lt;/a&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Year: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;", Month: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;", Day: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;a id="rest_code_7e644b994f0c4147a53fafda335afd31-13" name="rest_code_7e644b994f0c4147a53fafda335afd31-13"&gt;&lt;/a&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;This works, but it’s not great.&lt;/p&gt;
&lt;p&gt;There are a few things I don’t like about regular expressions and this example in particular:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;The regex itself is already quite cryptic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Group access via numbers is hard to follow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any change to the regex might require me to recount and update group indices.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Honestly, I don’t want to count parentheses every time I tweak something.
It’s error-prone!&lt;/p&gt;
&lt;p&gt;The good news is that a better alternative has been available since Java 7.
Surprisingly, it’s still rarely used.
And no, it’s not extracting magic numbers into &lt;code class="docutils literal"&gt;public static final int THREE = 3;&lt;/code&gt; to keep Sonar happy.
(If you’re still on Java 6 or below: let’s talk.)&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.rweisleder.de/posts/java-regex-named-groups/"&gt;Read more…&lt;/a&gt; (2 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>java</category><category>regex</category><guid>https://blog.rweisleder.de/posts/java-regex-named-groups/</guid><pubDate>Tue, 13 May 2025 18:00:00 GMT</pubDate></item></channel></rss>