PHP Strings & Patterns
PHP Regular Expression, Part 2
Welcome to the second part of the PHP Regular Expression series. In this part we look in detail at what quantifiers, pattern modifiers, assertions, and subpattern modifiers are in regular expressions and how they work. Let's get started:
Quantifiers in PHP Regular Expressions
When matching a pattern against a string, you often need to say how many times a character or group may repeat. The special characters that PHP regular expressions use to express this repetition are called quantifiers. The quantifiers are listed below:
Quantifier | Description |
---|---|
n* | Matches n zero or more times. |
n+ | Matches n one or more times. |
n? | Matches n zero or one time (n is optional). |
{n} | Matches exactly n times. |
{n,} | Matches at least n times. |
{n,m} | Matches at least n and at most m times. |
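The counted quantifiers in the table can be compared side by side in a minimal sketch. The sample string and the variable names are assumptions made up for illustration; the `\b` on both sides forces each match to cover a whole run of letters.

```php
<?php
// Hypothetical sample: runs of the letter "a" of length 1 to 4.
$subject = 'a aa aaa aaaa';

// {2}: exactly two, {3,}: three or more, {2,3}: two to three.
preg_match_all('/\ba{2}\b/', $subject, $exactlyTwo);
preg_match_all('/\ba{3,}\b/', $subject, $threeOrMore);
preg_match_all('/\ba{2,3}\b/', $subject, $twoToThree);

echo implode(', ', $exactlyTwo[0]) . "\n";   // aa
echo implode(', ', $threeOrMore[0]) . "\n";  // aaa, aaaa
echo implode(', ', $twoToThree[0]) . "\n";   // aa, aaa
```

Without the word boundaries, `a{2}` would also match inside the longer runs, which is usually not what a length check intends.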
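n*: searching for zero or more occurrences
<?php
/*** 4 x and 4 z chars ***/
$string = "xxxxzzzz";
/*** greedy regex (the parentheses act as the pattern delimiters) ***/
preg_match_all("(x*)",$string,$matches);
/*** results ***/
print_r($matches);
?>
Output
Array
(
    [0] => Array
        (
            [0] => xxxx
            [1] =>
            [2] =>
            [3] =>
            [4] =>
            [5] =>
        )

)
After the run of x characters, x* still matches the empty string at each remaining position, which is why five empty entries follow.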
Example #2
<?php
// Sample strings with prefixes and optional digits
$strings = [
    "item_",
    "product123",
    "category_456",
    "code_789",
    "invalidString",
];
// Match word characters followed by zero or more digits
foreach ($strings as $str) {
    if (preg_match('/\w*\d*/', $str)) {
        echo "Match: $str\n";
    } else {
        echo "No match: $str\n";
    }
}
?>
Output:
Match: item_
Match: product123
Match: category_456
Match: code_789
Match: invalidString
Because both \w* and \d* are allowed to match zero characters, this pattern matches every string, including invalidString. To reject some inputs, the pattern needs anchors and at least one required part.
Example #3
<?php
// Sample list of words
$words = [
    "apples",
    "cats",
    "dogs",
    "houses",
    "keys",
];
// Match words ending with 's'
foreach ($words as $word) {
    if (preg_match('/.*s$/', $word)) {
        echo "Match: $word\n";
    } else {
        echo "No match: $word\n";
    }
}
?>
Output
Match: apples
Match: cats
Match: dogs
Match: houses
Match: keys
Example #4
<?php
// Sample log entries
$logEntries = [
"[2023-01-01 12:30:45] INFO: Application started",
"[2023-01-01 13:45:20] WARNING: Database connection failed",
"[2023-01-02 08:00:01] ERROR: File not found: file123.txt",
"[2023-01-02 09:15:30] INFO: User logged in: john_doe",
"[2023-01-03 14:30:15] DEBUG: API request: /api/data",
];
// Regular expression to capture timestamp, log level, and message
$pattern = '/\[(.*?)\] (\w+): (.*)/';
// Process each log entry
foreach ($logEntries as $logEntry) {
if (preg_match($pattern, $logEntry, $matches)) {
// Extracted components
$timestamp = $matches[1];
$logLevel = $matches[2];
$message = $matches[3];
// Print the result
echo "Timestamp: $timestamp\n";
echo "Log Level: $logLevel\n";
echo "Message: $message\n";
echo "---\n";
} else {
echo "No match: $logEntry\n";
}
}
?>
Output
Timestamp: 2023-01-01 12:30:45
Log Level: INFO
Message: Application started
---
Timestamp: 2023-01-01 13:45:20
Log Level: WARNING
Message: Database connection failed
---
Timestamp: 2023-01-02 08:00:01
Log Level: ERROR
Message: File not found: file123.txt
---
Timestamp: 2023-01-02 09:15:30
Log Level: INFO
Message: User logged in: john_doe
---
Timestamp: 2023-01-03 14:30:15
Log Level: DEBUG
Message: API request: /api/data
---
n+: searching for one or more occurrences
<?php
/*** 5 x and 3 z chars ***/
$string = "xxzxzzxx";
/*** greedy regex ***/
preg_match_all("(x+)",$string,$matches);
/*** results ***/
print_r($matches);
?>
Output
Array ( [0] => Array ( [0] => xx [1] => x [2] => xx ) )
Example #2
<?php
// Sample phone numbers in various formats
$phoneNumbers = [
"+1 (555) 123-4567",
"+44 20 7946 0958",
"+81 3-1234-5678",
"+49 (0) 30 1234 5678",
"555-7890",
];
// Regular expression to capture country code, area code, and local number
$pattern = '/\+(\d+)[^\d]*?(\d+)[^\d]*?(\d+)/';
// Process each phone number
foreach ($phoneNumbers as $phoneNumber) {
if (preg_match($pattern, $phoneNumber, $matches)) {
// Extracted components
$countryCode = $matches[1];
$areaCode = $matches[2];
$localNumber = $matches[3];
// Print the result
echo "Phone Number: $phoneNumber\n";
echo "Country Code: +$countryCode\n";
echo "Area Code: $areaCode\n";
echo "Local Number: $localNumber\n";
echo "---\n";
} else {
echo "No match: $phoneNumber\n";
}
}
?>
Output:
Phone Number: +1 (555) 123-4567
Country Code: +1
Area Code: 555
Local Number: 123
---
Phone Number: +44 20 7946 0958
Country Code: +44
Area Code: 20
Local Number: 7946
---
Phone Number: +81 3-1234-5678
Country Code: +81
Area Code: 3
Local Number: 1234
---
Phone Number: +49 (0) 30 1234 5678
Country Code: +49
Area Code: 0
Local Number: 30
---
No match: 555-7890
Each (\d+) captures only one unbroken run of digits, so the third group stops at the next space or hyphen, and for the German number the (0) is taken as the area code.
Example #3
<?php
// Sample text with email addresses
$text = "Contact us at info@example.com or support+john@example.net for assistance.";
// Regular expression to capture email addresses (a plus sign is allowed in the local part)
$pattern = '/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/';
// Process the text
if (preg_match_all($pattern, $text, $matches)) {
// Extracted email addresses
$emailAddresses = $matches[0];
// Print the result
foreach ($emailAddresses as $email) {
echo "Email Address: $email\n";
}
} else {
echo "No email addresses found in the text.";
}
?>
Output
Email Address: info@example.com
Email Address: support+john@example.net
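n?: searching for zero or one occurrence
A ? placed after another quantifier, as in +? below, does not mean "optional"; it makes that quantifier lazy, so it matches as few characters as possible.
<?php
/*** 5 x and 3 z chars ***/
$string = "xxzxzzxx";
/*** lazy regex: +? matches one or more, but as few as possible ***/
echo preg_match("(p+?)",$string,$matches)?"P is Found":"P is not found";
echo "<br>";
echo preg_match("(x+?)",$string,$matches)?"x is Found":"x is not found";
?>
Output
P is not found<br>x is Found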
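To see ? in its plain "zero or one" role, here is a small sketch. The word list and variable names are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical word list; "u?" makes the letter "u" optional.
$words = ["color", "colour", "colouur", "colr"];
$results = [];
foreach ($words as $word) {
    // ^ and $ anchor the pattern so the whole word must match.
    $results[$word] = preg_match('/^colou?r$/', $word) === 1;
}
foreach ($results as $word => $matched) {
    echo $word . ': ' . ($matched ? 'match' : 'no match') . "\n";
}
```

Only color and colour match: the u may appear once or not at all, but not twice, and the o before it is still required.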
Example #2
<?php
// Sample HTML content
$htmlContent = '<p>This is <b>bold</b> and <i>italic</i> text.</p>';
// Regular expression to capture content within HTML tags
$pattern = '/<.*?>(.*?)<\/.*?>/';
// Process the HTML content
if (preg_match_all($pattern, $htmlContent, $matches)) {
    // Extracted content within tags
    $tagContents = $matches[1];
    // Print the result
    foreach ($tagContents as $content) {
        echo "Content within tags: $content\n";
    }
} else {
    echo "No HTML tags found in the content.";
}
?>
Output
Content within tags: This is <b>bold
Content within tags: italic
The opening <p> is paired with the first closing tag that follows it, </b>, so the first capture runs across the nested <b>. A pattern like this one does not track nesting.
Example #3
<?php
// Sample text document with items and tags
$textDocument = "
Item 1: apple, orange, banana

Item 2: banana, grape, kiwi, peach

Item 3: apple, kiwi, mango
";
// Regular expression to capture items and tags
$pattern = '/Item (\d+): (.*?)(?=\n\n|$)/s';
// Process the text document
if (preg_match_all($pattern, $textDocument, $matches, PREG_SET_ORDER)) {
    // Extracted items and tags
    foreach ($matches as $match) {
        $itemNumber = $match[1];
        $tags = explode(', ', $match[2]);
        // Print the result
        echo "Item $itemNumber:\n";
        foreach ($tags as $tag) {
            echo " - $tag\n";
        }
        echo "---\n";
    }
} else {
    echo "No matches found in the text document.\n";
}
?>
Output:
Item 1:
 - apple
 - orange
 - banana
---
Item 2:
 - banana
 - grape
 - kiwi
 - peach
---
Item 3:
 - apple
 - kiwi
 - mango
---
Example #4
<?php
// Sample text document with book information
$textDocument = "
Book 1:
Title: The Great Gatsby
Author: F. Scott Fitzgerald
Publication Year: 1925

Book 2:
Title: To Kill a Mockingbird, Author: Harper Lee, Year: 1960

Book 3:
Title: 1984 - Author: George Orwell (Published: 1949)
";
// Regular expression to capture book information
$pattern = '/Book (\d+):.*?Title: (.*?)(?:Author: (.*?))?(?:Publication Year: (\d+))?(?=\n\n|$)/s';
// Process the text document
if (preg_match_all($pattern, $textDocument, $matches, PREG_SET_ORDER)) {
    // Extracted book information
    foreach ($matches as $match) {
        $bookNumber = $match[1];
        $title = $match[2];
        $author = isset($match[3]) ? $match[3] : 'Unknown';
        $publicationYear = isset($match[4]) ? $match[4] : 'Unknown';
        // Print the result
        echo "Book $bookNumber:\n";
        echo " - Title: $title\n";
        echo " - Author: $author\n";
        echo " - Publication Year: $publicationYear\n";
        echo "---\n";
    }
} else {
    echo "No matches found in the text document.\n";
}
?>
Output
Book 1:
 - Title: The Great Gatsby
 - Author: F. Scott Fitzgerald
 - Publication Year: 1925
---
Book 2:
 - Title: To Kill a Mockingbird
 - Author: Harper Lee
 - Publication Year: 1960
---
Book 3:
 - Title: 1984
 - Author: George Orwell
 - Publication Year: 1949
---
This listing shows the intended result rather than the literal output. The pattern only recognizes the labels Author: and Publication Year:, so for Book 2 and Book 3 (which use Year: and Published:) the year is reported as Unknown, and the captured title and author keep the separators and line breaks next to them unless you trim() them.
{n}: searching for an exact number of occurrences
<?php
// create a string
$string = 'PHP123';
// look for a match
echo preg_match("/PHP[0-9]{3}/", $string, $matches)?"Yes, there are three Decimal Digits after PHP.":"No, there are not three Decimal Digits after PHP.";
?>
Output
Yes, there are three Decimal Digits after PHP.
Example #2
<?php
// Get the mobile number from the form submission
$mobileNumber = "01788223344";
// Define the regular expression pattern
$pattern = '/^(?:\+?88)?(017|018)\d{8}$/';
// Perform the match
if (preg_match($pattern, $mobileNumber)) {
    echo "Mobile number is valid: $mobileNumber";
} else {
    echo "Invalid mobile number. Please enter a valid number.";
}
?>
Output
Mobile number is valid: 01788223344
{n,}: searching for a minimum number of occurrences
<?php
// create a string
$string = 'PHP123';
// look for a match
echo preg_match("/PHP[0-9]{2,}/", $string, $matches)?"Yes, there are two or more Decimal Digits after PHP.":"No, there are not two or more Decimal Digits after PHP.";
?>
Output
Yes, there are two or more Decimal Digits after PHP.
Example #2
<?php
// Sample list of passwords
$passwords = [
    "Pass123",
    "SecurePW567",
    "WeakPW", // Invalid: Less than 8 characters
    "VeryStrongPassword!123", // Valid: Meets the minimum requirement
];
// Regular expression to validate passwords (minimum 8 characters)
$pattern = '/^.{8,}$/';
// Process each password
foreach ($passwords as $password) {
    if (preg_match($pattern, $password)) {
        echo "Valid Password: $password\n";
    } else {
        echo "Invalid Password: $password\n";
    }
}
?>
Output:
Invalid Password: Pass123
Valid Password: SecurePW567
Invalid Password: WeakPW
Valid Password: VeryStrongPassword!123
{n,m}: searching for a number of occurrences between two limits
<?php
// create a string
$string = 'PHP123';
// look for a match
echo preg_match("/PHP[0-9]{2,3}/", $string, $matches)?"Yes, there are two to three digits after PHP.":"No, there are not two to three digits after PHP.";
?>
Output
Yes, there are two to three digits after PHP.
Example #2
<?php
// Sample list of usernames
$usernames = [
    "john_doe123", // Invalid: Contains an underscore
    "aliceSmith", // Valid
    "usr123", // Valid: 6 alphanumeric characters
    "thisIsAnExtremelyLongUsername123", // Invalid: Exceeds 15 characters
    "user!name", // Invalid: Contains non-alphanumeric characters
];
// Regular expression to validate usernames (between 4 and 15 characters, alphanumeric)
$pattern = '/^[A-Za-z0-9]{4,15}$/';
// Process each username
foreach ($usernames as $username) {
    if (preg_match($pattern, $username)) {
        echo "Valid Username: $username\n";
    } else {
        echo "Invalid Username: $username\n";
    }
}
?>
Output
Invalid Username: john_doe123
Valid Username: aliceSmith
Valid Username: usr123
Invalid Username: thisIsAnExtremelyLongUsername123
Invalid Username: user!name
Pattern Modifiers in PHP Regular Expressions
In PHP regular expressions, special characters such as i and m can be placed after the closing delimiter (the final forward slash) to change how the pattern is searched. These are called pattern modifiers. The modifiers covered here are listed below:
Pattern Modifier | Description |
---|---|
i | Performs a case-insensitive search. |
m | Multiline mode: ^ and $ also match at the start and end of each line. |
x | Ignores whitespace and # comments inside the pattern. |
U | Makes quantifiers ungreedy (lazy) by default. |
u | Treats the pattern and the subject as UTF-8. |
/i: case-insensitive search
<?php
// create a string
$string = 'abcdefghijklmnopqrstuvwxyz0123456789';
// try to match our pattern
if(preg_match("/^ABC/i", $string))
{
    // echo this if it matches
    echo 'The string begins with abc';
}
else
{
    // if no match is found echo this line
    echo 'No match found';
}
?>
Output
The string begins with abc
Note that even though our pattern is uppercase and the string is lowercase, the search succeeds because of the i modifier.
Example #2
<?php
$products = array("iPhone X", "Samsung Galaxy S10", "Google Pixel 3", "Sony Xperia Z3","Google Pixel 4");
$searchTerm = "Pixel"; // Assuming the search term is obtained from user input
// Escape special characters in the search term to prevent regex issues
$escapedSearchTerm = preg_quote($searchTerm, '/');
// Use case-insensitive regex search with preg_grep
$matches = preg_grep('/' . $escapedSearchTerm . '/i', $products);
// Display the matching products
echo "Matching Products:\n";
foreach ($matches as $match) {
    echo $match . "\n";
}
?>
Output
Matching Products:
Google Pixel 3
Google Pixel 4
Example #3
<?php
// Simulated array of existing usernames in your database
$existingUsernames = array("JohnDoe", "Alice123", "BobSmith", "Eve");
// Assume the new username is obtained from user input
$newUsername = $_POST['username'];
// Escape special characters in the new username
$escapedNewUsername = preg_quote($newUsername, '/');
// Check if the new username already exists (case-insensitive)
$matches = preg_grep('/^' . $escapedNewUsername . '$/i', $existingUsernames);
if (!empty($matches)) {
    echo "Username already exists. Please choose a different one.";
} else {
    echo "Username is available. You can proceed with registration.";
}
/m: searching across multiple lines
<?php
// create a string
$string = 'Bangladesh'."\n".'India'."\n".'Pakistan'."\n".'Srilanka'."\n";
// look for a match
if(preg_match("/^Pakistan/im", $string))
{
    echo 'Pattern Found';
}
else
{
    echo 'Pattern not found';
}
?>
Output
Pattern Found
However, if you leave out the m modifier, the result is Pattern not found: without it, ^ matches only at the very start of the string, and Pakistan begins on a later line. See the example below:
<?php
// create a string
$string = 'Bangladesh'."\n".'India'."\n".'Pakistan'."\n".'Srilanka'."\n";
// look for a match
if(preg_match("/^Pakistan/i", $string))
{
    echo 'Pattern Found';
}
else
{
    echo 'Pattern not found';
}
?>
Output
Pattern not found
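The effect of m on both anchors can also be seen by counting matches. This is a minimal sketch; the variable names are assumptions introduced here, and the string reuses the country list from the example above.

```php
<?php
$countries = "Bangladesh\nIndia\nPakistan\nSrilanka\n";

// Without m, ^ and $ refer to the whole string, which contains
// newlines that \w cannot match, so nothing is found.
$withoutM = preg_match_all('/^\w+$/', $countries, $plain);

// With m, ^ and $ also match at the start and end of each line.
$withM = preg_match_all('/^\w+$/m', $countries, $perLine);

echo "Without m: $withoutM\n"; // Without m: 0
echo "With m: $withM\n";       // With m: 4
print_r($perLine[0]);
```

Here preg_match_all returns the number of matches, so the same pattern goes from zero matches to one match per line once m is added.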
Example #2
Let's consider an example of extracting and manipulating data from a text file. Suppose you have a log file with entries about user activity, and you want to extract and analyze specific data such as timestamps, usernames, and the actions performed. In addition, you want the multi-line search to be case-insensitive so that variations in capitalization are handled.
Suppose the entries in your log file look like this:
[2023-01-15 12:30:45] User 'Alice123' logged in.
[2023-01-15 13:15:20] User 'BobSmith' performed an action.
[2023-01-15 14:05:10] User 'eve' updated their profile.
[2023-01-15 15:40:55] User 'JohnDoe' logged in.
Now you want to extract the timestamps, usernames, and actions. You want to search the entries across multiple lines for a user-provided username, regardless of case.
Here is a PHP script that demonstrates this using the i modifier together with the m modifier:
<?php
$logFileContent = file_get_contents('path/to/your/logfile.txt'); // Read the log file content
$searchTerm = $_GET['username']; // User-provided search term
// Escape special characters in the search term
$escapedSearchTerm = preg_quote($searchTerm, '/');
// Define the regex pattern to extract relevant information
$pattern = '/^\[(.*?)\] User \'(' . $escapedSearchTerm . ')\' (.+)$/im';
// Use preg_match_all to find matches in the log file content
preg_match_all($pattern, $logFileContent, $matches, PREG_SET_ORDER);
// Display the extracted information
echo "Search Results:<br>";
foreach ($matches as $match) {
    $timestamp = $match[1];
    $username = $match[2];
    $action = $match[3];
    echo "Timestamp: $timestamp, Username: $username, Action: $action<br>";
}
Example #3
Let's look at one more example. First, create a file named logfile.log with the following text:
[2023-01-20 08:45:12] Error: Division by zero
File: /path/to/file.php
Line: 15
Trace:
at divide (/path/to/file.php:15)
at process (/path/to/file.php:10)
at main (/path/to/file.php:5)
[2023-01-20 09:30:05] Warning: Undefined variable $var
File: /path/to/another_file.php
Line: 8
Trace:
at process (/path/to/another_file.php:8)
at main (/path/to/another_file.php:3)
Now use the following regex, built with the m modifier, to read the log file line by line:
<?php
$logFileContent = file_get_contents('logfile.log'); // Read the log file content
// Define the regex pattern to extract error information
$pattern = '/^\[(.*?)\] (.*?): (.*?)$/m';
// Use preg_match_all to find all matches in the log file content
preg_match_all($pattern, $logFileContent, $matches, PREG_SET_ORDER);
// Display the extracted error information
echo "Error Information:<br>";
foreach ($matches as $match) {
    $timestamp = $match[1];
    $errorType = $match[2];
    $errorMessage = $match[3];
    echo "Timestamp: $timestamp\n";
    echo "Error Type: $errorType\n";
    echo "Error Message: $errorMessage\n";
    echo "<hr>";
}
/x: ignoring whitespace and comments inside the pattern
<?php
// create a string
$string = 'Bangladesh'."\n".'Pakistan'."\n".'India'."\n".'Nepal'."\n";
// create our regex using comments and store the regex
// in a variable to be used with preg_match
$regex ='
/        # opening delimiter
^        # caret means beginning of a line (because of the m modifier)
India    # the pattern to match
/imx';
// look for a match
if(preg_match($regex, $string))
{
    echo 'Pattern Found';
}
else
{
    echo 'Pattern not found';
}
?>
Output
Pattern Found
However, if you leave out the x modifier, the result is Pattern not found: the whitespace and the # comments are then treated as literal parts of the pattern, so nothing in the string matches. See the example below:
<?php
// create a string
$string = 'Bangladesh'."\n".'Pakistan'."\n".'India'."\n".'Nepal'."\n";
// create our regex using comments and store the regex
// in a variable to be used with preg_match
$regex ='
/        # opening delimiter
^        # caret means beginning of a line (because of the m modifier)
India    # the pattern to match
/im';
// look for a match
if(preg_match($regex, $string))
{
    echo 'Pattern Found';
}
else
{
    echo 'Pattern not found';
}
?>
Output
Pattern not found
Example #3
<?php
$dateString = "2023-12-25";
// Define the regex pattern with the /x modifier
$pattern = '/
    ^           # Start of the string
    (\d{4})     # Capture four digits for the year
    -           # Dash separator for the year and month
    (\d{2})     # Capture two digits for the month
    -           # Dash separator for the month and day
    (\d{2})     # Capture two digits for the day
    $           # End of the string
/x';
// Use preg_match to find the match in the date string
if (preg_match($pattern, $dateString, $matches)) {
    $year = $matches[1];
    $month = $matches[2];
    $day = $matches[3];
    echo "Year: $year\n";
    echo "Month: $month\n";
    echo "Day: $day";
} else {
    echo "Invalid date format.";
}
/U: making the pattern ungreedy (quantifiers match as little as possible)
<?php
/*** a simple string ***/
$string = 'foobar foo--bar fubar';
/*** try to match the pattern ***/
if(preg_match("/foo(.*)bar/U", $string)){
    echo 'Match found';
}
else{
    echo 'No match found';
}
?>
Example #2
<?php
$htmlContent = '
<p>This is the first paragraph.</p>
<p>Second paragraph with <a href="#">a link</a>.</p>
<p>Third paragraph with <strong>strong text</strong>.</p>
';
// Define the regex pattern with the /U modifier.
// With U, a plain .* is already lazy; writing .*? here would flip it back to greedy.
$pattern = '/<p>(.*)<\/p>/Us';
// Use preg_match to find the first paragraph content
if (preg_match($pattern, $htmlContent, $matches)) {
    $paragraphContent = $matches[1];
    echo "First Paragraph Content: $paragraphContent";
} else {
    echo "No matching paragraph found.";
}
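Because the U modifier inverts greediness, a ? after a quantifier has the opposite effect from usual when U is set. The sketch below compares the three combinations; the sample string and variable names are assumptions used only for illustration.

```php
<?php
// Hypothetical sample with two adjacent <b> elements.
$subject = '<b>one</b><b>two</b>';

// Default: .* is greedy and runs to the last </b>.
preg_match('/<b>(.*)<\/b>/', $subject, $greedy);
// With U: .* becomes lazy and stops at the first </b>.
preg_match('/<b>(.*)<\/b>/U', $subject, $ungreedy);
// With U and ?: the ? flips the quantifier back to greedy.
preg_match('/<b>(.*?)<\/b>/U', $subject, $inverted);

echo $greedy[1] . "\n";   // one</b><b>two
echo $ungreedy[1] . "\n"; // one
echo $inverted[1] . "\n"; // one</b><b>two
```

In practice it is often clearer to leave U out and mark individual quantifiers as lazy with ?, so that a reader does not have to check the modifiers to know how each quantifier behaves.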
/u: treating the pattern and subject as UTF-8
In PHP, the u modifier makes the pattern and the subject string be treated as UTF-8. This is especially useful when working with multibyte character encodings such as UTF-8. Here is an example that illustrates the u modifier:
<?php
$utf8String = 'Hello, মাহমুদ বিন মাসুদ !';
// Define a regex pattern to match each character individually
$pattern = '/./u';
// Use preg_match_all to find all matches in the UTF-8 string
preg_match_all($pattern, $utf8String, $matches);
// Display the individual characters (one Unicode code point per line)
echo "Individual Characters:\n";
foreach ($matches[0] as $character) {
    echo "$character\n";
}
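What u changes can be made concrete by counting what the dot matches with and without it. This is a minimal sketch: the variable names are assumptions, and the word is written with `\u{...}` escapes (PHP 7 or later) so that the five code points are unambiguous.

```php
<?php
// Five Bengali code points (the word "Bangla"), each 3 bytes in UTF-8.
$word = "\u{09AC}\u{09BE}\u{0982}\u{09B2}\u{09BE}";

$byteCount = strlen($word);                        // counts bytes
$withoutU = preg_match_all('/./', $word, $bytes);  // dot matches one byte
$withU = preg_match_all('/./u', $word, $chars);    // dot matches one code point

echo "Bytes: $byteCount\n";          // Bytes: 15
echo "Matches without u: $withoutU\n"; // Matches without u: 15
echo "Matches with u: $withU\n";       // Matches with u: 5
```

Note that with u the dot matches one code point, not one visible character: combining vowel signs are counted separately from the consonant they attach to.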
Example #3
<?php
$utf8Data = "Alice: 95\nBob: 87\nমুহিব্বুল্লাহ বিন মাসউদ : 92\nমাহমুদ বিন মাসউদ : 88\nমারয়াম বিনতে মাসউদ : 90";
// Define the regex pattern with the u modifier
$pattern = '/^([^\n]+): (\d+)$/mu';
// Use preg_match_all to find all matches in the UTF-8 string
preg_match_all($pattern, $utf8Data, $matches, PREG_SET_ORDER);
// Initialize variables for calculating average score
$totalScore = 0;
$numberOfEntries = count($matches);
// Display extracted information and calculate average score
echo "Name and Score Information:\n";
foreach ($matches as $match) {
    $name = $match[1];
    $score = (int)$match[2];
    echo "Name: $name, Score: $score\n";
    // Accumulate scores for calculating average
    $totalScore += $score;
}
// Calculate and display average score
if ($numberOfEntries > 0) {
    $averageScore = $totalScore / $numberOfEntries;
    echo "Average Score: $averageScore";
} else {
    echo "No entries found.";
}
Point-Based Assertions in PHP Regular Expressions
Assertion characters are used in PHP to specify the exact point in a string at which the pattern has to match. The assertions are listed below:
Point-Based Assertion | Description |
---|---|
\b | Word boundary: used to search for a complete, standalone word. |
\B | Not a word boundary: used to search for text that is part of a larger word. |
\A | Matches only at the start of the string (independent of multiline mode). |
\Z | Matches at the end of the string, or just before a final newline (independent of multiline mode). |
\z | Matches only at the very end of the string (independent of multiline mode). |
\G | Matches at the position where the search starts, which in preg_match_all is where the previous match ended, so matches must follow one another without gaps. |
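The difference between \Z and \z only shows up when the subject ends with a newline. The following sketch uses a made-up file name and variable names as assumptions for illustration.

```php
<?php
// Hypothetical subject with a trailing newline.
$subject = "report.txt\n";

$upperZ = preg_match('/txt\Z/', $subject); // end, or just before a final newline
$lowerZ = preg_match('/txt\z/', $subject); // only the very end of the string
$dollar = preg_match('/txt$/', $subject);  // without modifiers, behaves like \Z

echo "\\Z: $upperZ\n"; // \Z: 1
echo "\\z: $lowerZ\n"; // \z: 0
echo "\$: $dollar\n";  // $: 1
```

For validation, \z is the stricter choice, because it rejects input that has a stray newline appended.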
\b: searching for a complete, standalone word
<?php
/*** a simple string ***/
$string = 'Masud is staying at the lab.';
/*** here we will try match the string "lab" ***/
if(preg_match ("/\blab\b/i", $string))
{
    /*** if we get a match ***/
    echo "Lab is a completely separate word";
}
else
{
    /*** if no match is found ***/
    echo 'There is no separate word named Lab';
}
?>
Output
Lab is a completely separate word
If we search for the word stay instead, the pattern does not match, because stay does not appear as a separate word (it is only part of staying). See the example below:
<?php
/*** a simple string ***/
$string = 'Masud is staying at the lab.';
/*** here we will try match the string "stay" ***/
if(preg_match ("/\bstay\b/i", $string))
{
    /*** if we get a match ***/
    echo "stay is a completely separate word";
}
else
{
    /*** if no match is found ***/
    echo 'There is no Completely separate word named stay';
}
?>
Output
There is no Completely separate word named stay
Example #3
<?php
// Sample text
$text = "I have an apple, and I like to eat apples. The apple is a delicious fruit.";
// Word to search for and replace
$searchWord = "apple";
$replaceWord = "orange";
// Use \b to match whole words
$pattern = "/\b" . preg_quote($searchWord, '/') . "\b/";
// Perform the replacement
$newText = preg_replace($pattern, $replaceWord, $text);
// Output the result
echo "Original Text: $text\n";
echo "Modified Text: $newText\n";
?>
Output
Original Text: I have an apple, and I like to eat apples. The apple is a delicious fruit.
Modified Text: I have an orange, and I like to eat apples. The orange is a delicious fruit.
The word apples is left unchanged, because \b requires a word boundary immediately after apple.
\B: searching for text that is part of a word
<?php
/*** a simple string ***/
$string = 'Masud will available at 6:00 PM';
/*** here we will try match the string "lab" ***/
if(preg_match ("/lab\B/i", $string))
{
    /*** if we get a match ***/
    echo "Lab is a part of word";
}
else
{
    /*** if no match is found ***/
    echo 'Lab is a completely Seperate Word';
}
?>
Output
Lab is a part of word
Since will is a complete, separate word, the pattern will\B does not match. See the example below:
<?php
/*** a simple string ***/
$string = 'Masud will available at 6:00 PM';
/*** here we will try match the string "will" ***/
if(preg_match ("/will\B/i", $string))
{
    /*** if we get a match ***/
    echo "will is a part of word";
}
else
{
    /*** if no match is found ***/
    echo 'will not a part of Word';
}
?>
Output
will not a part of Word
Example #3
<?php
function extractHashtags($text)
{
    // Extract hashtags: \B makes sure the # is not attached to a preceding word character
    $pattern = "/\B#\w+\b/";
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}
// Sample text with hashtags
$text = "Check out #PHP and #JavaScript. #Coding is fun!";
// Extract the hashtags
$hashtags = extractHashtags($text);
// Output the result
echo "Original Text: $text\n";
echo "Extracted Hashtags: " . implode(', ', $hashtags) . "\n";
?>
Output
Original Text: Check out #PHP and #JavaScript. #Coding is fun!
Extracted Hashtags: #PHP, #JavaScript, #Coding
Example #4
<?php
function extractEmailAddresses($text)
{
    // Extract email addresses; \b keeps each match on word boundaries
    $pattern = "/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/";
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}
// Sample text with email addresses
$text = "Contact support@example.com for assistance. Email me at user@emailprovider.com.";
// Extract the email addresses
$emailAddresses = extractEmailAddresses($text);
// Output the result
echo "Original Text: $text\n";
echo "Extracted Email Addresses: " . implode(', ', $emailAddresses) . "\n";
?>
Output:
Original Text: Contact support@example.com for assistance. Email me at user@emailprovider.com.
Extracted Email Addresses: support@example.com, user@emailprovider.com
\A: searching from the start of the string
<?php
// create a string
$string = 'abcdefghijklmnopqrstuvwxyz0123456789';
// try to match our pattern
if(preg_match("/\Aabc/i", $string))
{
    // echo this if it matches
    echo 'The string begins with abc';
}
else
{
    // if no match is found echo this line
    echo 'No match found';
}
?>
Output
The string begins with abc
\z: searching at the very end of the string
<?php
// create a string
$string = 'abcdefghijklmnopqrstuvwxyz0123456789';
// try to match our pattern (no backslash before 89: \8 would be read as a back reference)
if(preg_match("/89\z/i", $string))
{
    // echo this if it matches
    echo 'The string ends with 89';
}
else
{
    // if no match is found echo this line
    echo 'No match found';
}
?>
Output
The string ends with 89
\G: matching from where the previous match ended
<?php
$pattern = '#(match),#';
$subject = "match,match,match,match,not-match,match";
preg_match_all( $pattern, $subject, $matches );
// Finds "match," 5 times; the fifth one is found inside "not-match,"
print_r($matches[0]);
$pattern = '#(\Gmatch),#';
$subject = "match,match,match,match,not-match,match";
preg_match_all( $pattern, $subject, $matches );
// Finds "match," only 4 times, because at "not-match" the chain is broken
print_r($matches[0]);
?>
Output
Array
(
    [0] => match,
    [1] => match,
    [2] => match,
    [3] => match,
    [4] => match,
)
Array
(
    [0] => match,
    [1] => match,
    [2] => match,
    [3] => match,
)
Example #2
<?php
function extractKeyValuePairs($text)
{
// Extract key-value pairs
$pattern = "/(?:\G|^)([a-zA-Z0-9_]+)=([^&]+)&?/";
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
$result = [];
foreach ($matches as $match) {
$result[$match[1]] = $match[2];
}
return $result;
}
// Sample text with key-value pairs
$text = "key1=value1&key2=value2&key3=value3";
// Extract key-value pairs
$keyValuePairs = extractKeyValuePairs($text);
// Output the result
echo "Original Text: $text\n";
echo "Extracted Key-Value Pairs: " . json_encode($keyValuePairs) . "\n";
?>
Output:
Original Text: key1=value1&key2=value2&key3=value3
Extracted Key-Value Pairs: {"key1":"value1","key2":"value2","key3":"value3"}
Subpattern Modifiers and Assertions
Modifier | Description |
---|---|
(?:) | Non-capturing group: groups part of the pattern without capturing it, so it is not available for back-referencing or extraction. |
(?=) | Positive lookahead: checks that the first part is immediately followed by the second part. |
(?!) | Negative lookahead: checks that the first part is not immediately followed by the second part. |
(?<=) | Positive lookbehind: checks that a part is immediately preceded by another part. |
(?<!) | Negative lookbehind: checks that a part is not immediately preceded by another part. |
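The lookbehind and negative lookahead forms from the table can be tried out with a short sketch. The price list, the word list, and the variable names are assumptions invented for this illustration.

```php
<?php
// Hypothetical price list used only for illustration.
$prices = 'USD 100, EUR 200, USD 300';

// (?<=USD ) positive lookbehind: digits that come right after "USD ".
preg_match_all('/(?<=USD )\d+/', $prices, $usd);
// (?<!USD ) negative lookbehind: whole numbers not preceded by "USD ".
// The \b stops the match from starting in the middle of a number.
preg_match_all('/\b(?<!USD )\d+/', $prices, $notUsd);

$words = 'whiteboard whitehouse whitewash';
// (?!house) negative lookahead: "white" words where "house" does not follow.
preg_match_all('/\bwhite(?!house)\w*/', $words, $notHouse);

echo implode(', ', $usd[0]) . "\n";      // 100, 300
echo implode(', ', $notUsd[0]) . "\n";   // 200
echo implode(', ', $notHouse[0]) . "\n"; // whiteboard, whitewash
```

In each case the text inside the lookaround is only checked, not consumed, so it never appears in the matched result.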
A practical example of (?:):
<?php
// Get the mobile number from the form submission
$mobileNumber = "+8801788223344";
// Define the regular expression pattern
$pattern = '/^(?:\+?88)?(017|018)\d{8}$/';
// Perform the match
if (preg_match($pattern, $mobileNumber)) {
    echo "Mobile number is valid: $mobileNumber";
} else {
    echo "Invalid mobile number. Please enter a valid number.";
}
?>
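What (?:) changes is easiest to see in the $matches array. The sketch below runs the mobile-number pattern once with a capturing prefix group and once with a non-capturing one; the variable names are assumptions made for this comparison.

```php
<?php
$number = '+8801788223344';

// Capturing group for the optional country prefix.
preg_match('/^(\+?88)?(017|018)\d{8}$/', $number, $capturing);
// Non-capturing group: same match, but the prefix is not stored.
preg_match('/^(?:\+?88)?(017|018)\d{8}$/', $number, $nonCapturing);

print_r($capturing);    // [0] => +8801788223344, [1] => +88, [2] => 017
print_r($nonCapturing); // [0] => +8801788223344, [1] => 017
```

With the non-capturing version the operator code is always in index 1, regardless of how many helper groups are added to the pattern later.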
Example #2
<?php
// Sample URLs
$urls = [
    'https://www.example.com/page1',
    'http://subdomain.example.org/path/to/page2',
    'ftp://ftp.example.net/file',
    'invalid-url',
];
// Define the regular expression pattern
$pattern = '/^(https?|ftp):\/\/([^\/]+)\/(.*)$/';
// Iterate through each URL
foreach ($urls as $url) {
    // Perform the regular expression match
    if (preg_match($pattern, $url, $matches)) {
        // $matches[0] contains the entire matched string
        // $matches[1] contains the protocol (http, https, or ftp)
        // $matches[2] contains the domain name
        // $matches[3] contains the path
        echo "URL: $url\n";
        echo "Protocol: {$matches[1]}\n";
        echo "Domain: {$matches[2]}\n";
        echo "Path: {$matches[3]}\n";
        echo "-----------------\n";
    } else {
        echo "Invalid URL: $url\n";
        echo "-----------------\n";
    }
}
?>
Output:
URL: https://www.example.com/page1
Protocol: https
Domain: www.example.com
Path: page1
-----------------
URL: http://subdomain.example.org/path/to/page2
Protocol: http
Domain: subdomain.example.org
Path: path/to/page2
-----------------
URL: ftp://ftp.example.net/file
Protocol: ftp
Domain: ftp.example.net
Path: file
-----------------
Invalid URL: invalid-url
-----------------
Some Use Cases of Non-capturing Groups
1. Use of the (?:abc|def) Pattern
<?php
// Sample text containing email addresses and phone numbers
$text = "Contact us at support@example.com or call +1 (555) 123-4567. For sales, email sales@example.com or call +44 (20) 7123 1234.";
// Define the regular expression pattern using a non-capturing group
$pattern = '/(?:\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b|\+\d{1,4}\s?\(\d{1,}\)\s?\d{1,}[-\s]?\d{1,})/';
// Perform the regular expression match
if (preg_match_all($pattern, $text, $matches)) {
    // $matches[0] contains an array of all matched strings
    echo "Matched Email Addresses and Phone Numbers:\n";
    foreach ($matches[0] as $match) {
        echo $match . "\n";
    }
} else {
    echo "No matches found.";
}
?>
Output:
Matched Email Addresses and Phone Numbers:
support@example.com
+1 (555) 123-4567
sales@example.com
+44 (20) 7123 1234
Another Example
<?php
// Sample text containing dates in different formats
$text = "The event is scheduled for 05/20/2023 or 2023-05-20. Please mark your calendar.";
// Define the regular expression pattern using a non-capturing group
$pattern = '/(?:\b\d{2}\/\d{2}\/\d{4}\b|\b\d{4}-\d{2}-\d{2}\b)/';
/*
\b\d{2}\/\d{2}\/\d{4}\b: Pattern for matching dates in "MM/DD/YYYY" format.
|: OR operator.
\b\d{4}-\d{2}-\d{2}\b: Pattern for matching dates in "YYYY-MM-DD" format.
*/
// Perform the regular expression match
if (preg_match_all($pattern, $text, $matches)) {
    // $matches[0] contains an array of all matched strings
    echo "Matched Dates:\n";
    foreach ($matches[0] as $match) {
        echo $match . "\n";
    }
} else {
    echo "No matches found.";
}
?>
Output
Matched Dates:
05/20/2023
2023-05-20
2. Use of the (?:\d{3}){2} Pattern
<?php
// Sample text containing pairs of postal codes
$text = "The delivery is scheduled for 12345-6789 or 98765-4321. Please provide your ZIP code for accurate shipping.";
// Define the regular expression pattern using a non-capturing group
$pattern = '/(?:\b\d{5}-\d{4}\b|\b\d{5}\b)/';
/*
\b\d{5}-\d{4}\b: Pattern for matching five-digit ZIP codes followed by a dash and four digits.
|: OR operator.
\b\d{5}\b: Pattern for matching standalone five-digit ZIP codes.
*/
// Perform the regular expression match
if (preg_match_all($pattern, $text, $matches)) {
    // $matches[0] contains an array of all matched strings
    echo "Matched ZIP Codes:\n";
    foreach ($matches[0] as $match) {
        echo $match . "\n";
    }
} else {
    echo "No matches found.";
}
?>
Output
Matched ZIP Codes:
12345-6789
98765-4321
Another Example:
<?php
// Sample text containing pairs of alphanumeric codes
$text = "The product code is ABC123-XYZ456. Another code is DEF789-GHI987.";

// Define the regular expression pattern using a non-capturing group
$pattern = '/(?:[A-Za-z]+\d+-[A-Za-z]+\d+)/';

// Perform the regular expression match
if (preg_match_all($pattern, $text, $matches)) {
    // $matches[0] contains an array of all matched strings
    echo "Matched Alphanumeric Codes:\n";
    foreach ($matches[0] as $match) {
        echo $match . "\n";
    }
} else {
    echo "No matches found.";
}
?>
Output
Matched Alphanumeric Codes:
ABC123-XYZ456
DEF789-GHI987
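The two examples in this section use a non-capturing group only to wrap a pattern or its alternatives; neither puts a quantifier on the group itself, which is what (?:\d{3}){2} in the heading does. Below is a minimal sketch of that case. The helper name findSixDigitCodes and the sample text are assumptions made for illustration. The three-digit group is repeated exactly twice, so the pattern matches six consecutive digits, and because the group is non-capturing, only the full matches are collected.

```php
<?php
// Hypothetical helper: returns every standalone run of exactly six digits.
function findSixDigitCodes(string $text): array
{
    // (?:\d{3}){2} repeats the three-digit group twice without capturing it.
    // The \b anchors stop a longer digit run from matching partially.
    preg_match_all('/\b(?:\d{3}){2}\b/', $text, $matches);
    return $matches[0];
}

// Made-up sample text: one five-digit number and two six-digit numbers.
$text = "Order 123456 shipped, ref 98765 pending, batch 654321 done.";
print_r(findSixDigitCodes($text));
```

With this input the helper should return only 123456 and 654321; the five-digit 98765 is skipped because the group must repeat exactly twice.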
Use of the (?:Mr|Ms|Mrs) Pattern
<?php
// Sample text containing salutations and names
$text = "Mr. Smith Ms. Johnson Mrs. Davis Mr. Brown Ms. Anderson";

// Define the regular expression pattern
// (?:Mr|Ms|Mrs)\.?\s[A-Za-z]+ matches a salutation ("Mr.", "Ms." or "Mrs.") followed by a name.
$pattern = '/(?:Mr|Ms|Mrs)\.?\s[A-Za-z]+/';

// Perform the regular expression match
if (preg_match_all($pattern, $text, $matches)) {
    // $matches[0] contains an array of all matched strings
    echo "Matched Salutations and Names following the pattern (?:Mr|Ms|Mrs)\\.?\\s[A-Za-z]+:\n";
    foreach ($matches[0] as $match) {
        echo $match . "\n";
    }
} else {
    echo "No matches found.";
}
?>
Output
Matched Salutations and Names following the pattern (?:Mr|Ms|Mrs)\.?\s[A-Za-z]+:
Mr. Smith
Ms. Johnson
Mrs. Davis
Mr. Brown
Ms. Anderson
Using (?=) to check that the first part of a joined word is followed by the second part (positive lookahead)
<?php
/*** a simple string ***/
$string = 'I live in the whitehouse';

/*** try to match white followed by house ***/
if (preg_match("/white(?=house)/i", $string)) {
    /*** if we find the word white, followed by house ***/
    echo 'Found a match';
} else {
    /*** if no match is found ***/
    echo 'No match found';
}
?>
Output
Found a match
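A lookahead only tests what comes next; the text inside (?=...) does not become part of the match. The sketch below makes that visible by printing the matched text for the same string, with and without the lookahead. The helper name matchedText is a hypothetical one introduced for this illustration.

```php
<?php
// Hypothetical helper: returns the text a pattern matched, or null if there is no match.
function matchedText(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

$string = 'I live in the whitehouse';

// With the lookahead, only "white" is matched; "house" is checked but not consumed.
var_dump(matchedText('/white(?=house)/i', $string)); // string(5) "white"

// Without the lookahead, "house" becomes part of the match.
var_dump(matchedText('/whitehouse/i', $string));     // string(10) "whitehouse"
```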
Example #2
<?php
// Sample date strings
$dateStrings = [
    '2023-12-26',
    '1998-05-15',
    'invalid-date',
    '2023-13-01',
];

// Define the regular expression pattern for validating dates in YYYY-MM-DD format
$pattern = '/^(?=\d{4}-\d{2}-\d{2}$)(\d{4})-(\d{2})-(\d{2})$/';

// Iterate through each date string
foreach ($dateStrings as $date) {
    // Perform the regular expression match
    if (preg_match($pattern, $date, $matches)) {
        // $matches[0] contains the entire matched string
        // $matches[1] contains the year
        // $matches[2] contains the month
        // $matches[3] contains the day
        echo "Date String: $date\n";
        echo "Year: {$matches[1]}\n";
        echo "Month: {$matches[2]}\n";
        echo "Day: {$matches[3]}\n";
        echo "-----------------\n";
    } else {
        echo "Invalid Date String: $date\n";
        echo "-----------------\n";
    }
}
?>
Output:
Date String: 2023-12-26
Year: 2023
Month: 12
Day: 26
-----------------
Date String: 1998-05-15
Year: 1998
Month: 05
Day: 15
-----------------
Invalid Date String: invalid-date
-----------------
Date String: 2023-13-01
Year: 2023
Month: 13
Day: 01
-----------------
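The pattern above checks only the shape of the string, which is why 2023-13-01 is accepted with month 13. One possible way to also reject impossible dates is to pass the captured parts to PHP's built-in checkdate() function. This is a sketch added for illustration, not part of the original example, and the helper name isValidIsoDate is hypothetical.

```php
<?php
// Hypothetical helper: true only for a real calendar date written as YYYY-MM-DD.
function isValidIsoDate(string $date): bool
{
    // First check the shape and capture year, month and day.
    if (preg_match('/^(\d{4})-(\d{2})-(\d{2})$/', $date, $m) !== 1) {
        return false;
    }
    // checkdate() expects its arguments in the order month, day, year.
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
}

foreach (['2023-12-26', 'invalid-date', '2023-13-01', '2023-02-30'] as $date) {
    echo $date . ': ' . (isValidIsoDate($date) ? 'valid' : 'invalid') . "\n";
}
```

Only 2023-12-26 is reported as valid here; the other three fail either the shape check or the calendar check.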
Example #3
<?php
/*
We'll create a PHP script that validates passwords based on certain criteria:
a minimum length of 8, at least one lowercase letter, one uppercase letter,
one digit, and one special character.
*/

// Sample passwords
$passwords = [
    'StrongP@ssword123',
    'weakpassword',
    'AnotherWeak123',
    'NoSpecialChar123',
];

// Define the regular expression pattern for password validation.
// Each (?=...) lookahead checks one requirement without consuming any characters.
$pattern = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/';

// Iterate through each password
foreach ($passwords as $password) {
    // Perform the regular expression match
    if (preg_match($pattern, $password)) {
        echo "Strong Password: $password\n";
    } else {
        echo "Weak Password: $password\n";
    }
    echo "-----------------\n";
}
?>
Output:
Strong Password: StrongP@ssword123
-----------------
Weak Password: weakpassword
-----------------
Weak Password: AnotherWeak123
-----------------
Weak Password: NoSpecialChar123
-----------------
Using (?!) to check that the first part is not immediately followed by the second part (negative lookahead)
<?php
/*** a simple string ***/
$string = 'I live in the white house';

/*** try to match white not followed by house ***/
if (preg_match("/white(?!house)/i", $string)) {
    /*** if we find the word white, not followed by house ***/
    echo 'Found a match';
} else {
    /*** if no match is found ***/
    echo 'No match found';
}
?>
Output
Found a match
Example #2
<?php
// Updated sample passwords
$passwords = [
    'StrongP@ssword123',
    'weakpassword',
    'AnotherWeak123',
    'NoSpecialChar123',
    'Complex&SecurePass12',
];

// Define the regular expression pattern for password validation
$pattern = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])(?!.*(password|123|admin))[\w@$!%*?&]{8,}$/';
/*
(?!.*(password|123|admin)): Negative lookahead for disallowed patterns.
It ensures that the password does not contain the strings "password", "123", or "admin".
*/

// Iterate through each password
foreach ($passwords as $password) {
    // Perform the regular expression match
    if (preg_match($pattern, $password)) {
        echo "Strong Password: $password\n";
    } else {
        echo "Weak Password: $password\n";
    }
    echo "-----------------\n";
}
?>
Output
Weak Password: StrongP@ssword123
-----------------
Weak Password: weakpassword
-----------------
Weak Password: AnotherWeak123
-----------------
Weak Password: NoSpecialChar123
-----------------
Strong Password: Complex&SecurePass12
-----------------
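The negative lookahead in this pattern is case-sensitive, so a password containing "Admin" or "Password" with a capital letter would not be rejected. A sketch of one possible adjustment, assuming that behavior is unwanted, wraps the banned words in an inline case-insensitive group, (?i:...). The helper name isStrongPassword and the sample password Admin&Secure9 are hypothetical.

```php
<?php
// Hypothetical helper: same rules as the example above, but the banned words
// are matched case-insensitively through the inline (?i:...) group.
function isStrongPassword(string $password): bool
{
    $pattern = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])'
             . '(?!.*(?i:password|123|admin))[\w@$!%*?&]{8,}$/';
    return preg_match($pattern, $password) === 1;
}

var_dump(isStrongPassword('Admin&Secure9'));        // bool(false): contains "Admin"
var_dump(isStrongPassword('Complex&SecurePass12')); // bool(true)
```

Only the banned-word check becomes case-insensitive; the lookaheads for lowercase and uppercase letters still behave as before.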
Example #3
<?php
// Sample file names
$fileNames = [
    'document.txt',
    'image.jpg',
    'presentation.pptx',
    'badfile.exe',
    'hackfile.bat',
    'archive.zip',
    'invalid-file.invalid',
];

// Define the regular expression pattern for file name validation
$pattern = '/^[a-zA-Z0-9_-]+\.(?!(?:exe|bat|invalid)$)([a-zA-Z]{1,4})$/';
/*
(?!(?:exe|bat|invalid)$): Negative lookahead for disallowed file extensions.
It ensures that the file extension is not "exe", "bat", or "invalid".
The inner group makes the $ apply to all three alternatives, not only to "invalid".
*/

// Iterate through each file name
foreach ($fileNames as $fileName) {
    // Perform the regular expression match
    if (preg_match($pattern, $fileName, $matches)) {
        // $matches[0] contains the entire matched string
        // $matches[1] contains the file extension
        echo "File Name: $fileName\n";
        echo "File Extension: {$matches[1]}\n";
        echo "-----------------\n";
    } else {
        echo "Invalid File Name: $fileName\n";
        echo "-----------------\n";
    }
}
?>
Output
File Name: document.txt
File Extension: txt
-----------------
File Name: image.jpg
File Extension: jpg
-----------------
File Name: presentation.pptx
File Extension: pptx
-----------------
Invalid File Name: badfile.exe
-----------------
Invalid File Name: hackfile.bat
-----------------
File Name: archive.zip
File Extension: zip
-----------------
Invalid File Name: invalid-file.invalid
-----------------
Using (?<=) to check whether a specific word is immediately preceded by another word (positive lookbehind)
<?php
/*** a simple string ***/
$string = 'I live in the whitehouse';

/*** try to match house preceded by white ***/
if (preg_match("/(?<=white)house/i", $string)) {
    /*** if we find the word house, preceded by white ***/
    echo 'Found a match';
} else {
    /*** if no match is found ***/
    echo 'No match found';
}
?>
Output
Found a match
Example #2
<?php
// Sample date strings with a prefix
$dateStrings = [
    'Date: 12/26/2023',
    'Date: 05/15/1998',
    'Invalid Date: 01/01/2023', // still matches: "Date: " directly precedes the date
    'Date: 13/01/2023',         // still matches: the pattern does not check the month range
];

// Define the regular expression pattern for extracting dates with a positive lookbehind
$pattern = '/(?<=Date: )(\d{2}\/\d{2}\/\d{4})/';
/*
(?<=Date: ): Positive lookbehind for the prefix "Date: ".
*/

// Iterate through each date string
foreach ($dateStrings as $dateString) {
    // Perform the regular expression match
    if (preg_match($pattern, $dateString, $matches)) {
        // $matches[0] contains the entire matched string
        // $matches[1] contains the date
        echo "Date String: $dateString\n";
        echo "Extracted Date: {$matches[1]}\n";
        echo "-----------------\n";
    } else {
        echo "Invalid Date String: $dateString\n";
        echo "-----------------\n";
    }
}
?>
Output:
Date String: Date: 12/26/2023
Extracted Date: 12/26/2023
-----------------
Date String: Date: 05/15/1998
Extracted Date: 05/15/1998
-----------------
Date String: Invalid Date: 01/01/2023
Extracted Date: 01/01/2023
-----------------
Date String: Date: 13/01/2023
Extracted Date: 13/01/2023
-----------------
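The lookbehind checks only the characters immediately before the date, so "Invalid Date: 01/01/2023" is extracted as well: the six characters in front of the date are still "Date: ". If the prefix is meant to start the string, one possible way to tighten the pattern is to put ^ inside the lookbehind. This is a sketch under that assumption, and the helper name extractPrefixedDate is hypothetical.

```php
<?php
// Hypothetical helper: returns the date only when the string starts with "Date: ".
function extractPrefixedDate(string $line): ?string
{
    // The ^ inside the lookbehind requires "Date: " to begin at the start of the string.
    return preg_match('/(?<=^Date: )\d{2}\/\d{2}\/\d{4}/', $line, $m) === 1 ? $m[0] : null;
}

var_dump(extractPrefixedDate('Date: 12/26/2023'));         // string(10) "12/26/2023"
var_dump(extractPrefixedDate('Invalid Date: 01/01/2023')); // NULL
```

The sketch still checks only the shape of the date, so a value such as 13/01/2023 would be extracted as before.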
Using (?<!) to check that a specific word is not immediately preceded by another word (negative lookbehind)
<?php
/*** a simple string ***/
$string = 'I live in the white house';

/*** try to match house not preceded by white ***/
if (preg_match("/(?<!white)house/i", $string)) {
    /*** if we find the word house, not directly preceded by white ***/
    echo 'Found a match';
} else {
    /*** if no match is found ***/
    echo 'No match found';
}
?>
Output
Found a match
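The match succeeds here because a space, not "white", sits directly in front of "house". A short sketch comparing three strings may make the rule clearer; the helper name hasHouseNotAfterWhite and the two extra sample strings are assumptions made for illustration.

```php
<?php
// Hypothetical helper: true when "house" appears without "white" directly before it.
function hasHouseNotAfterWhite(string $text): bool
{
    return preg_match('/(?<!white)house/i', $text) === 1;
}

var_dump(hasHouseNotAfterWhite('I live in the white house')); // bool(true): a space precedes "house"
var_dump(hasHouseNotAfterWhite('I live in the whitehouse'));  // bool(false): "white" precedes "house"
var_dump(hasHouseNotAfterWhite('I live in the greenhouse'));  // bool(true): "green" precedes "house"
```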
Example #2
<?php
// Sample product codes
$productCodes = [
    'PRD-12345',
    'INV-67890',
    'XYZ-98765',
    'Invalid-54321', // Invalid prefix
    'PRD-32456',
];

// Define the regular expression pattern for product code validation
$pattern = '/^[A-Z]{3}-(?<!INV-|XYZ-)(\d{5})$/';

// Iterate through each product code
foreach ($productCodes as $productCode) {
    // Perform the regular expression match
    if (preg_match($pattern, $productCode, $matches)) {
        // $matches[0] contains the entire matched string
        // $matches[1] contains the product number
        echo "Product Code: $productCode\n";
        echo "Product Number: {$matches[1]}\n";
        echo "-----------------\n";
    } else {
        echo "Invalid Product Code: $productCode\n";
        echo "-----------------\n";
    }
}
?>
Output
Product Code: PRD-12345
Product Number: 12345
-----------------
Invalid Product Code: INV-67890
-----------------
Invalid Product Code: XYZ-98765
-----------------
Invalid Product Code: Invalid-54321
-----------------
Product Code: PRD-32456
Product Number: 32456
-----------------
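The same filtering can also be written with a negative lookahead placed at the start of the pattern, which some readers may find easier to follow because the blocked prefixes are rejected before they are consumed. This is an alternative sketch, not the approach used in the example above, and the helper name productNumber is hypothetical.

```php
<?php
// Hypothetical helper: returns the five-digit number, or null for a rejected code.
function productNumber(string $code): ?string
{
    // (?!INV-|XYZ-) rejects the blocked prefixes right after the start of the string.
    return preg_match('/^(?!INV-|XYZ-)[A-Z]{3}-(\d{5})$/', $code, $m) === 1 ? $m[1] : null;
}

foreach (['PRD-12345', 'INV-67890', 'XYZ-98765', 'Invalid-54321', 'PRD-32456'] as $code) {
    $number = productNumber($code);
    echo $code . ': ' . ($number ?? 'rejected') . "\n";
}
```

For the five sample codes this sketch accepts and rejects the same entries as the lookbehind version.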