
Python Lambda and Regex – A good team for replacing a string using dictionaries

Posted on December 2, 2020 by geohernandez

The aim of this post is to show you a specific and useful Python tip: replacing parts of a string with the matching values stored in a dictionary. The task may sound trivial, but in practice it can be an interesting challenge.

In our scenario, a string contains a group of values that match some of the keys in our dictionary. We don’t want to do repetitive replacements in a loop; that solution is still valid, but it is not the objective of this post.

Here is an example with the string variable and the dictionary:
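For contrast, the loop-based solution mentioned above might look like the following sketch. The sample data and the names parameters and replace_with_loop are assumptions made for illustration; only the name request comes from this post.

```python
# Hypothetical sample data, assumed for illustration only.
request = "SELECT * FROM {table} WHERE country = '{country}'"
parameters = {"{table}": "customers", "{country}": "Spain"}


def replace_with_loop(text, mapping):
    """Replace every key of `mapping` found in `text`, one str.replace call per key."""
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text


result = replace_with_loop(request, parameters)
print(result)  # SELECT * FROM customers WHERE country = 'Spain'
```

This works, but it scans the string once per key, and a value inserted by one pass can be replaced again by a later key.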

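A minimal sketch of such a pair, assuming a SQL-like template. Only the name request is taken from this post; the name parameters and every sample value are hypothetical.

```python
# Hypothetical sample data, assumed for illustration only.
request = "SELECT * FROM {table} WHERE country = '{country}'"
parameters = {"{table}": "customers", "{country}": "Spain"}

# Every dictionary key appears literally inside the request string.
keys_found = [key for key in parameters if key in request]
print(keys_found)
```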

As you can see, the keys of the dictionary match words in the request variable. In common scenarios, however, the dictionary is not limited to a few keys, so the approach in this article focuses on producing accurate results even when we have to deal with hundreds of items.

We need to start with a short explanation of regular expressions. They do not belong only to the Python world; in fact, we can find them in many modern programming languages. Nonetheless, I want to cite the Python documentation for its definition:

“A regular expression (or RE) specifies a set of strings that matches it;”. Source: https://docs.python.org/3/library/re.html

A regular expression (shortened as regex or regexp) also allows us to define a search pattern, which is vital for the use case explained above. The first step consists of building a new dictionary in which the special symbols in each key are escaped with a backslash. This avoids conflicts, because characters such as “.”, “*”, “(” or “{” have a special meaning inside a regex pattern and would otherwise not be matched literally.

More details in this link: https://bit.ly/2KRb9Ew

In this case, we build the new dictionary with a dictionary comprehension and apply the regular expression escape function (re.escape) to each key. Once that is done, we compile a regex for later use. The regex is composed of all the keys of the formatted dictionary, joined with the symbol “|”, which means “or” inside a pattern. Explaining the benefits of compiling a regex is out of the scope of this post, but you can find interesting articles about it.

Let me show you the following piece of code, which covers the new dictionary and the creation of the compiled regex object:

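A sketch of these two steps, under stated assumptions: formatted_parameters is the name used later in this post, while parameters, pattern and the sample values are hypothetical.

```python
import re

# Hypothetical sample dictionary, assumed for illustration only.
parameters = {"{table}": "customers", "{country}": "Spain"}

# Step 1: escape regex metacharacters in every key so each key is matched literally.
formatted_parameters = {re.escape(key): value for key, value in parameters.items()}

# Step 2: join the escaped keys with "|" (alternation) and compile the pattern once.
pattern = re.compile("|".join(formatted_parameters.keys()))

matches = pattern.findall("{table} and {country}")
print(matches)  # ['{table}', '{country}']
```

One design point worth knowing: alternation is tried from left to right, so if one key is a prefix of another, it may help to sort the longer keys first.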

Maybe you are asking why a compiled regex object was required. Before explaining the main reason, I want to introduce the part of the regular expression module that we will be using: re.sub. Here is the syntax:

re.sub(<regex>, <repl>, <string>, count=0, flags=0)

This function returns a new string that results from performing replacements on a search string. For more information, visit this link: https://docs.python.org/3/library/re.html

As the documentation mentions, inside the re.sub function we can specify <repl> as a function, and re.sub will then call this function for each match found. Instead of passing a named function, this is where we can combine re.sub with a lambda expression, and I think this kind of scenario is a good opportunity to use one.

Returning to the earlier question about why we need a compiled regex object: one of the best answers is that we can reuse the compiled expression, and the compiled object offers its own sub method. In one simple line we have the power to combine a lambda function, regex sub and a dictionary to get the desired result.

Let me add the next statement before showing the final script:
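To illustrate <repl> as a function, here is a small sketch that uses a named function instead of a lambda. The function name shout and the sample text are made up for this example.

```python
import re


def shout(match):
    """Called once per match; the returned string replaces the matched text."""
    return match.group(0).upper()


text = "red green blue"
result = re.sub(r"green|blue", shout, text)
print(result)  # red GREEN BLUE
```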

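A sketch of what such a statement could look like, assuming the names request and formatted_parameters used in this post. The names parameters and pattern, and the sample values, are hypothetical.

```python
import re

# Hypothetical sample data, assumed for illustration only.
request = "SELECT * FROM {table} WHERE country = '{country}'"
parameters = {"{table}": "customers", "{country}": "Spain"}

formatted_parameters = {re.escape(key): value for key, value in parameters.items()}
pattern = re.compile("|".join(formatted_parameters.keys()))

# m.group(0) is the raw matched text; the dictionary keys are escaped,
# so the match is escaped again before the lookup.
result = pattern.sub(lambda m: formatted_parameters[re.escape(m.group(0))], request)
print(result)  # SELECT * FROM customers WHERE country = 'Spain'
```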

What is new in the previous statement? Well, probably the first thing you are asking about is the word lambda. It is a keyword that tells Python you are defining a lambda function or lambda expression. In simple words, it is a shortcut for creating anonymous functions, and it yields a function object. In reality, nothing forces you to use lambda: it is only a syntactically compact way of defining a function, and in many cases it is not even recommended. Still, the example used in this article is probably one of the few interesting uses (in my humble opinion) that deserves attention.

Before continuing, I encourage you to read this helpful article about lambda: https://realpython.com/python-lambda/#first-example

At this point, our lambda expression defines a bound variable, in this case m, followed immediately by the body of the function. Remember that a lambda is, at the end of the day, an anonymous function. Here is the interesting behavior of this code: the body looks up each match in formatted_parameters, the dictionary we created, whose keys need to match text in the request and whose values replace that text. This is where regex sub and the compiled object help us achieve the result compactly.

Remember that regex sub is able to call a function (in our case the lambda function) for every match. Inside that function, m.group(0) returns the exact text that was matched, so in practice every match in the request string is replaced with the value of the corresponding key in the formatted_parameters dictionary.

Here is the final version of our code:
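To make the point that lambda is only a compact way of defining a function, the sketch below performs the same replacement twice, once with a lambda and once with def. All names and sample values here are hypothetical.

```python
import re

# Hypothetical lookup table; the keys contain no regex metacharacters.
lookup = {"cat": "dog", "red": "blue"}
pattern = re.compile("cat|red")

# Version 1: anonymous function defined inline.
by_lambda = pattern.sub(lambda m: lookup[m.group(0)], "red cat")


# Version 2: the equivalent named function.
def replace_match(m):
    return lookup[m.group(0)]


by_def = pattern.sub(replace_match, "red cat")

print(by_lambda, "|", by_def)  # blue dog | blue dog
```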
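The per-match calls can be made visible with a small sketch that records what each call receives. The function name record and the sample text are assumptions for illustration.

```python
import re

seen = []


def record(m):
    """Store the matched text and its position, then return the replacement."""
    seen.append((m.group(0), m.span()))
    return "#"


result = re.sub("a|b", record, "a-b-a")
print(result)  # #-#-#
print(seen)    # [('a', (0, 1)), ('b', (2, 3)), ('a', (4, 5))]
```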

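Putting the pieces together, a complete script could look like the sketch below. It is one possible arrangement, not necessarily the exact original: the function name replace_from_dict, the guard for an empty dictionary, and the sample data are assumptions.

```python
import re


def replace_from_dict(text, mapping):
    """Replace every key of `mapping` found in `text` in a single regex pass."""
    if not mapping:
        # An empty alternation would match the empty string everywhere.
        return text
    # Escape the keys so regex metacharacters are matched literally.
    formatted_parameters = {re.escape(key): value for key, value in mapping.items()}
    pattern = re.compile("|".join(formatted_parameters.keys()))
    # The lambda is called once per match and returns the dictionary value.
    return pattern.sub(lambda m: formatted_parameters[re.escape(m.group(0))], text)


# Hypothetical sample data, assumed for illustration only.
request = "SELECT * FROM {table} WHERE country = '{country}'"
parameters = {"{table}": "customers", "{country}": "Spain"}

result = replace_from_dict(request, parameters)
print(result)  # SELECT * FROM customers WHERE country = 'Spain'
```

Because the string is scanned only once, a value that was just inserted is never replaced again by another key.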

I hope this simple trick is useful to you. Remember that every day is a great opportunity to learn new things. Happy Coding!!

Category: Chronicles from the trenches, Data Engineering, Python

2 thoughts on “Python Lambda and Regex – A good team for replacing a string using dictionaries”

  1. Nichol Sturkie says:
    January 29, 2021 at 8:56 pm

    Music began playing as soon as I opened up this web page, so annoying!

  2. james says:
    March 28, 2021 at 1:33 pm

    That is very cool — thanks for sharing

