Of course! Let’s break down Method 2 (Using Regular Expressions) step by step in a way that’s easy to understand.
A regular expression (regex) is a pattern-matching technique used to find and manipulate text. Python provides a built-in module called re that allows us to use regex.
import re  # Import the regular expressions module

def extract_response(response: str) -> str:
    # Use regex to find everything AFTER '</think>'
    match = re.search(r"</think>\s*(.*)", response, re.DOTALL)
    # If there’s a match, return only the actual response (after '</think>')
    return match.group(1).strip() if match else response
Let’s break this pattern down:
</think> – This literally looks for the marker </think>.
\s* – This means “match any whitespace characters (spaces, newlines, tabs) after </think>”. The * means “match zero or more times”.
(.*) – The parentheses capture everything after </think> and the whitespace that follows it.
  . means “any character” (a newline counts only when re.DOTALL is set).
  * means “zero or more occurrences of the preceding element”, which here is the “.”.
The re.DOTALL flag ensures that . also matches newlines, allowing multi-line responses.
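To see what re.DOTALL changes, here is a small sketch that runs the same pattern with and without the flag. The sample string and the variable names are made up for illustration.

```python
import re

text = "reasoning\n</think>\nFirst line\nSecond line"

# Without re.DOTALL, "." does not match "\n", so the capture ends at the first line break.
without_flag = re.search(r"</think>\s*(.*)", text)

# With re.DOTALL, "." also matches "\n", so the capture runs to the end of the string.
with_flag = re.search(r"</think>\s*(.*)", text, re.DOTALL)

print(repr(without_flag.group(1)))  # 'First line'
print(repr(with_flag.group(1)))     # 'First line\nSecond line'
```

Without the flag, a multi-line answer would be silently cut off after its first line.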
response = """This is the reasoning explanation.
It explains the logic before providing an answer.
</think> Here is the actual response."""
When we run:
clean_response = extract_response(response)
print(clean_response)
Output:
Here is the actual response.
✅ The reasoning part is removed!
If there’s no </think> in the response, the function will return the original response without any changes.
response_without_think = "This is a response without the think marker."
print(extract_response(response_without_think))
Output:
This is a response without the think marker.
✅ This ensures the function won’t break if the marker is missing.
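- If the text format changes slightly (e.g., extra spaces or blank lines after the marker), the regex still returns a clean result, because \s* and .strip() absorb the whitespace.
- response.split("</think>")[-1] also works, but it keeps the surrounding whitespace unless you call .strip() yourself. If the marker appears more than once, it returns the text after the last marker, whereas re.search starts from the first one.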
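As a rough comparison, the sketch below runs both approaches on the same input. The input string is invented for illustration and deliberately contains two markers plus extra whitespace.

```python
import re

raw = "step one </think> draft </think>\n\n  Final answer.\n"

# split: takes the text after the LAST marker and keeps the surrounding whitespace.
via_split = raw.split("</think>")[-1]

# re.search: anchors on the FIRST marker; \s* and .strip() trim the whitespace.
match = re.search(r"</think>\s*(.*)", raw, re.DOTALL)
via_regex = match.group(1).strip() if match else raw

print(repr(via_split))  # '\n\n  Final answer.\n'
print(repr(via_regex))  # 'draft </think>\n\n  Final answer.'
```

Which behavior you want depends on your data. If a response can contain several markers, decide whether the first or the last one should count before picking a method.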
The regex approach is a flexible way to pull the answer out of responses that follow this marker format. If you’re new to Python or unsure about regex, the split method is a simpler alternative.
Let me know if you need more explanation! 🚀