Insert separators into a string at regular intervals

Developer https://www.devze.com 2023-01-13 06:58 Source: Internet
I have the following string in php:

$string = 'FEDCBA9876543210';

The string can have 2 or more hexadecimal characters. I want to group the string in pairs, like:

$output_string = 'FE:DC:BA:98:76:54:32:10';

I wanted to use a regex for that. I think I saw a way to do it with something like a "recursive regex", but I can't remember it.

Any help appreciated :)


If you don't need to check the content, there is no use for regex.

Try this:

$outputString = chunk_split($string, 2, ":");
// generates: FE:DC:BA:98:76:54:32:10:

You might need to remove the last ":".
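
One way to drop that trailing separator is rtrim(), which strips the listed characters from the right end of a string. A minimal sketch, assuming the input itself never ends in a colon:

```php
<?php
$string = 'FEDCBA9876543210';

// chunk_split() appends the separator after every chunk, including the last one.
$withTrailing = chunk_split($string, 2, ':');

// rtrim() with a character list removes the trailing ':'.
$outputString = rtrim($withTrailing, ':');

echo $outputString, PHP_EOL; // FE:DC:BA:98:76:54:32:10
```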

Or this:

$outputString = implode(":", str_split($string, 2));
// generates: FE:DC:BA:98:76:54:32:10

Resources :

  • www.w3schools.com - chunk_split()
  • www.w3schools.com - str_split()
  • www.w3schools.com - implode()

On the same topic :

  • Split string into equal parts using PHP


Sounds like you want a regex like this:

/([0-9a-f]{2})/${1}:/gi

Which, in PHP is...

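<?php
$string = 'FEDCBA9876543210';
// PHP's preg functions have no "g" modifier; preg_replace() replaces every match by default.
$pattern = '/([0-9A-F]{2})/i';
$replacement = '${1}:';
// Note: this leaves a trailing ":" after the last pair.
echo preg_replace($pattern, $replacement, $string);
?>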

Please note the above code is currently untested.


You can make sure the string contains two or more hex letters (A-F) like this:

if (preg_match('!^\d*[A-F]\d*[A-F][\dA-F]*$!i', $string)) {
  ...
}

No need for a recursive regex. Strictly speaking, "recursive regex" is a contradiction in terms, since a regular language (which a regex parses) can't be recursive by definition, although PCRE does offer recursive patterns such as (?R) as an extension.

If you also want to check that the characters are grouped in pairs with colons in between, ignoring the two-hex-letter requirement for a moment, use:

if (preg_match('!^[\dA-F]{2}(?::[\dA-F]{2})*$!i', $string)) {
  ...
}

Now if you want to add the condition requiring two hex letters, use a positive lookahead:

if (preg_match('!^(?=[\d:]*[A-F][\d:]*[A-F])[\dA-F]{2}(?::[\dA-F]{2})*$!i', $string)) {
  ...
}

To explain how this works: the first thing it does is check, with a positive lookahead (i.e. (?=...)), that you have zero or more digits or colons followed by a hex letter, followed by zero or more digits or colons and then another hex letter. This ensures there are at least two hex letters in the string.

After the positive lookahead is the original expression that makes sure the string is pairs of hex digits.
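
The behaviour of the lookahead can be checked on a few inputs. This is only a sketch: the helper name is_grouped_hex() and the sample strings are made up for illustration, and the pattern assumes each colon is followed by exactly two hex digits.

```php
<?php
// Hypothetical helper: true if $s consists of colon-separated hex pairs
// and contains at least two hex letters (A-F).
function is_grouped_hex(string $s): bool
{
    $pattern = '!^(?=[\d:]*[A-F][\d:]*[A-F])[\dA-F]{2}(?::[\dA-F]{2})*$!i';
    return preg_match($pattern, $s) === 1;
}

var_dump(is_grouped_hex('FE:DC:BA')); // bool(true)
var_dump(is_grouped_hex('12:34:56')); // bool(false): no hex letters
var_dump(is_grouped_hex('1A:23'));    // bool(false): only one hex letter
var_dump(is_grouped_hex('FE:D'));     // bool(false): last group is not a pair
```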


Recursive regular expressions are usually not possible. You may use a regular expression recursively on the results of a previous regular expression, but most regular expression grammars do not allow recursion. This is the main reason why regular expressions are almost always inadequate for parsing things like HTML. Anyway, what you need doesn't require any kind of recursion.

What you want, simply, is to match a group multiple times. This is quite simple:

preg_match_all("/([a-f0-9]{2})/i", $string, $matches);

This will fill $matches with all occurrences of two hexadecimal digits (in a case-insensitive way). To replace them, use preg_replace, which takes the pattern, then the replacement, then the subject:

echo preg_replace("/([a-f0-9]{2})/i", '\1:', $string);

There will be one ':' too many at the end; you can strip it with substr:

echo substr(preg_replace("/([a-f0-9]{2})/i", '\1:', $string), 0, -1);
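
To see what ends up in $matches, here is a small sketch. It assumes the pattern matches one pair at a time (no + quantifier after the pair), and it joins the matches with implode(), which sidesteps the trailing colon.

```php
<?php
$string = 'FEDCBA9876543210';

// Without a quantifier on the pair, each two-character match is reported separately.
$count = preg_match_all('/[a-f0-9]{2}/i', $string, $matches);

// $matches[0] holds the full matches: ['FE', 'DC', 'BA', ...]
echo $count, PHP_EOL;                    // 8
echo implode(':', $matches[0]), PHP_EOL; // FE:DC:BA:98:76:54:32:10
```

Note that with an odd number of characters the last, unpaired character is not matched and would be dropped from the joined result.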


While it is not horrible practice to use rtrim(chunk_split($string, 2, ':'), ':'), I prefer to use direct techniques that avoid "mopping up" after making modifications.

Code:

$string = 'FEDCBA9876543210';
echo preg_replace('~[\dA-F]{2}(?!$)\K~', ':', $string);

Output:

FE:DC:BA:98:76:54:32:10

Don't be intimidated by the regex. The pattern says:

[\dA-F]{2}   # match exactly two numeric or A through F characters
(?!$)        # that is not located at the end of the string
\K           # restart the fullstring match

When I say "restart the fullstring match" I mean "forget the previously matched characters and start matching from this point forward". Because there are no additional characters matched after \K, the pattern effectively delivers the zero-width position where the colon should be inserted. In this way, no original characters are lost in the replacement.
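
The same idea can be wrapped for other group sizes. The helper below is a hypothetical sketch, not part of the answer above: the name insert_separator() and its parameters are invented, and it assumes the separator contains no characters that preg_replace treats specially in a replacement (such as $ or \).

```php
<?php
// Hypothetical helper: insert $separator after every $size hex digits,
// except after the final group.
function insert_separator(string $hex, int $size = 2, string $separator = ':'): string
{
    // \K discards what was matched so far, so the reported match is a
    // zero-width position and the replacement only inserts the separator.
    $pattern = '~[\dA-F]{' . $size . '}(?!$)\K~i';
    return preg_replace($pattern, $separator, $hex);
}

echo insert_separator('FEDCBA9876543210'), PHP_EOL;         // FE:DC:BA:98:76:54:32:10
echo insert_separator('FEDCBA9876543210', 4, '-'), PHP_EOL; // FEDC-BA98-7654-3210
```

Unlike the str_split() approach, nothing is split or re-joined here; the string is only ever extended at the matched positions.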
