
I'm going to be teaching a few developers regular expressions - what are some good homework problems? [closed]

Developers https://www.devze.com 2022-12-15 05:10 Source: Internet
Closed. This question is opinion-based. It is not currently accepting answers.

Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.

Closed 9 years ago.


I'm thinking of presenting questions in the form of "here is your input: [foo], here are the capture groups/results: [bar]" (and maybe writing a small script to test their answers for my results).
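A small grading script along those lines could look like the sketch below. It is only an illustration: the function name `check_answer`, the `CASES` list, and the 4-digit exercise are assumptions made for the example, not anything prescribed above.

```python
import re

def check_answer(pattern, cases):
    """Return a list of (input, expected, actual) tuples for failing cases."""
    regex = re.compile(pattern)
    failures = []
    for text, expected in cases:
        match = regex.search(text)
        # Compare the captured groups; None means "should not match".
        actual = match.groups() if match else None
        if actual != expected:
            failures.append((text, expected, actual))
    return failures

# Hypothetical exercise: capture a 4-digit number that is the whole input.
CASES = [
    ("1234", ("1234",)),
    ("0007", ("0007",)),
    ("12345", None),
    ("12a4", None),
]

print(check_answer(r"^(\d{4})$", CASES))  # [] (every case passes)
print(check_answer(r"(\d{4})", CASES))    # [('12345', None, ('1234',))]
```

The second call shows the kind of feedback a student would get for forgetting the anchors.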

What are some good regex questions to ask? I need everything from beginner questions like "validate a 4 digit number" to "extract postal codes from addresses".


A few that I can think of off the top of my head:

  1. Phone numbers in any format e.g. 555-5555, 555 55 55 55, (555) 555-555 etc.
  2. Remove all html tags from text.
  3. Match social security number (Finnish one is easy;)
  4. All IP addresses
  5. IP addresses with shorthand netmask (xx.xx.xx.xx/yy)
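For the last item, one possible reference answer is sketched below. It assumes dotted-decimal IPv4 with no leading zeros and a prefix length of 0 to 32; `OCTET`, `CIDR`, and `parse_cidr` are names made up for this example.

```python
import re

# One octet, 0-255, without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
# Four octets, a slash, then a prefix length from 0 to 32.
CIDR = re.compile(
    rf"({OCTET})\.({OCTET})\.({OCTET})\.({OCTET})/(3[0-2]|[12]?[0-9])"
)

def parse_cidr(text):
    """Return ([o1, o2, o3, o4], prefix) or None if the text is not valid."""
    m = CIDR.fullmatch(text)
    if m is None:
        return None
    *octets, prefix = m.groups()
    return [int(o) for o in octets], int(prefix)

print(parse_cidr("192.168.1.10/24"))  # ([192, 168, 1, 10], 24)
print(parse_cidr("256.1.1.1/8"))      # None
print(parse_cidr("10.0.0.1/33"))      # None
```

Using `fullmatch` instead of `^...$` avoids the surprise that `$` also matches just before a trailing newline.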


There's a bunch of examples of various regular expression techniques over at www.regular-expressions.info - everything from simple literal matching to backreferences and lookahead.


To keep things a bit more interesting than the usual email/phone/url stuff, try looking for more original exercises. Avoid boredom.

For example, have a look at Forsyth-Edwards Notation, which is used for describing a particular board position of a chess game.

Have your students validate and extract all the bits of information from a string like this:

rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2
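A possible model answer for the string above is sketched here. It assumes the usual six space-separated FEN fields; the group names and the `parse_fen` helper are invented for the example. The regex only checks the shape of each field, so the "every rank adds up to 8 squares" rule is checked in plain code afterwards.

```python
import re

RANK = r"[pnbrqkPNBRQK1-8]{1,8}"
FEN = re.compile(
    rf"(?P<board>{RANK}(?:/{RANK}){{7}})"          # eight ranks
    r" (?P<active>[wb])"                            # side to move
    r" (?P<castling>-|(?=[KQkq])K?Q?k?q?)"          # castling rights
    r" (?P<en_passant>-|[a-h][36])"                 # en passant square
    r" (?P<halfmove>[0-9]+)"                        # halfmove clock
    r" (?P<fullmove>[1-9][0-9]*)"                   # fullmove number
)

def parse_fen(text):
    """Return a dict of the six fields, or None if the text is not valid."""
    m = FEN.fullmatch(text)
    if m is None:
        return None
    fields = m.groupdict()
    for rank in fields["board"].split("/"):
        # A digit stands for that many empty squares, a letter for one piece.
        squares = sum(int(c) if c.isdigit() else 1 for c in rank)
        if squares != 8:
            return None
    return fields

fen = "rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2"
fields = parse_fen(fen)
print(fields["active"], fields["castling"], fields["fullmove"])  # b KQkq 2
```

Asking students whether the square count can be pushed into the regex itself makes a good follow-up discussion.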

Additionally, have a look at algebraic chess notation, used to describe moves. Extract chess moves out of a piece of text (and make them bold).

1. e4 e5 2. Nf3 Black now defends his pawn 2...Nc6 3. Bb5 Black threatens c4
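Here is a rough sketch of one way to approach it. The pattern covers only a simplified subset of algebraic notation (castling, piece moves, pawn moves with optional capture, promotion, check and mate marks), and "bold" is assumed to mean wrapping the move in HTML `<b>` tags; both are assumptions of this example.

```python
import re

SAN = re.compile(
    r"\b(?:O-O(?:-O)?"                        # castling, either side
    r"|[KQRBN][a-h]?[1-8]?x?[a-h][1-8]"       # piece move, optional capture
    r"|(?:[a-h]x)?[a-h][1-8](?:=[QRBN])?"     # pawn move, capture, promotion
    r")[+#]?(?!\w)"                           # optional check or mate mark
)

TEXT = ("1. e4 e5 2. Nf3 Black now defends his pawn 2...Nc6 "
        "3. Bb5 Black threatens c4")

print(SAN.findall(TEXT))
# ['e4', 'e5', 'Nf3', 'Nc6', 'Bb5', 'c4']

# \g<0> refers to the whole match.
print(SAN.sub(r"<b>\g<0></b>", TEXT))
```

Note that the trailing "c4" in the commentary is picked up as well. Whether that counts as a move or as a square being talked about cannot be decided by the pattern alone, which is a nice limitation to discuss.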


  • Validate phone numbers (extract area code + rest of number with grouping) (assuming US phone numbers; otherwise generalize for your style)
  • Play around with validating email addresses (probably want to tell the students that the full version is a hugely complicated regular expression, but for simple addresses it is pretty straightforward)


regexplib.com has a good library you can search through for examples.


How about extracting first name, middle name, last name, and personal suffix (Jr., III, etc.) from a format like:

Smith III, John Paul
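One way the answer might look, as a sketch: it assumes the layout "Last [Suffix], First [Middle]", ASCII letters only, and a short fixed list of suffixes. The group names are made up for the example.

```python
import re

NAME = re.compile(
    r"(?P<last>[A-Za-z'-]+)"
    r"(?: (?P<suffix>Jr\.|Sr\.|II|III|IV))?"   # optional personal suffix
    r", (?P<first>[A-Za-z'-]+)"
    r"(?: (?P<middle>[A-Za-z'-]+))?"           # optional middle name
)

def parse_name(text):
    """Return a dict with last, suffix, first, middle (None when absent)."""
    m = NAME.fullmatch(text)
    return m.groupdict() if m else None

print(parse_name("Smith III, John Paul"))
# {'last': 'Smith', 'suffix': 'III', 'first': 'John', 'middle': 'Paul'}
print(parse_name("Doe, Jane"))
# {'last': 'Doe', 'suffix': None, 'first': 'Jane', 'middle': None}
```

The suffix alternation lists `II` before `III`, so the match succeeds only through backtracking, which is worth pointing out to students.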

How about a regex to remove line breaks and tabs from the input?
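A minimal sketch, assuming "remove" means deleting the characters outright rather than replacing them with a space (`strip_breaks` is a name invented here):

```python
import re

def strip_breaks(text):
    """Delete carriage returns, line feeds, and tabs."""
    return re.sub(r"[\r\n\t]", "", text)

print(strip_breaks("a\tb\r\nc\n"))  # abc
```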


I would start with the common ones:

  • validate email
  • validate phone number
  • separate the parts of a URL
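For the URL exercise, a handy reference answer is the generic pattern given in Appendix B of RFC 3986. The sketch below wraps it in a helper; `split_url` and the dictionary keys are names chosen for this example.

```python
import re

# Pattern from RFC 3986, Appendix B. Group 2 is the scheme, 4 the authority,
# 5 the path, 7 the query, and 9 the fragment.
URI = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_url(url):
    """Split a URL into its five generic components (None when absent)."""
    m = URI.match(url)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

print(split_url("https://example.com:8080/a/b?x=1&y=2#top"))
# {'scheme': 'https', 'authority': 'example.com:8080', 'path': '/a/b',
#  'query': 'x=1&y=2', 'fragment': 'top'}
```

It splits but does not validate, so a follow-up exercise could be breaking the authority into host and port.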


Be cruel. Tell them to parse HTML.

RegEx match open tags except XHTML self-contained tags


Are you teaching them theory of finite automata as well?

Here is a good one: parse the addresses of churches correctly from this badly structured format (copy and paste it as text first) http://www.churchangel.com/WEBNY/newhart.htm


I'm a fan of parsing date strings. Define a few common date formats, as well as time and date-time formats. These are often good exercises because dates are simple mixes of digits and punctuation, so there's only a limited degree of freedom in parsing them.
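As a sketch of what such an exercise might look like, the example below assumes two formats, ISO-style `YYYY-MM-DD` with an optional `HH:MM` time and US-style `M/D/YYYY`. The format choice and the name `parse_date` are assumptions of this example. The regex checks the shape and `datetime` rejects impossible values such as February 30.

```python
import re
from datetime import datetime

DATE_PATTERNS = [
    # 2022-12-15, optionally followed by a space or T and HH:MM
    re.compile(r"(?P<year>[0-9]{4})-(?P<month>[0-9]{2})-(?P<day>[0-9]{2})"
               r"(?:[ T](?P<hour>[0-9]{2}):(?P<minute>[0-9]{2}))?"),
    # 12/15/2022
    re.compile(r"(?P<month>[0-9]{1,2})/(?P<day>[0-9]{1,2})/(?P<year>[0-9]{4})"),
]

def parse_date(text):
    """Return a datetime for a recognized, valid date string, else None."""
    for pattern in DATE_PATTERNS:
        m = pattern.fullmatch(text)
        if m is None:
            continue
        parts = {k: int(v) for k, v in m.groupdict().items() if v is not None}
        try:
            return datetime(parts["year"], parts["month"], parts["day"],
                            parts.get("hour", 0), parts.get("minute", 0))
        except ValueError:
            return None  # right shape, impossible date or time
    return None

print(parse_date("2022-12-15 05:10"))  # 2022-12-15 05:10:00
print(parse_date("2022-02-30"))        # None
```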


Just to throw them for a loop, why not reword a question or two to suggest that they write a regular expression to generate data fitting a specific pattern like email addresses, phone numbers, etc.? It's the same thing as validating, but it can help them get out of the mindset that regex is just for validation (the data generation tool in Visual Studio, for instance, uses regex to randomly generate data).


Rather than teaching examples based on the data set, I would do examples from the perspective of the rule set to get the basics across. Give them simple examples to solve that lead them to use ONE of several basic constructs in each solution. Then have a couple of "compound" regexes at the end.

Simple: s/abc/def/

Spinners and special characters: s/a\s*b/abc/

Character classes: s/[abc]/def/

Backreference: s/ab(c)/def$1/

Anchors: s/^fred/wilma/ s/rubble$/and betty/

Modifiers: s/Abcd/def/gi
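For students not working in Perl or sed, the same substitutions can be tried out in Python with `re.sub`. This is only a rough translation with made-up sample inputs. Two differences are worth flagging: `re.sub` replaces every match by default (like the `g` modifier), so `count=1` mimics a plain `s///`, and the `$` anchor is written at the end of the pattern.

```python
import re

results = {
    # s/abc/def/
    "simple": re.sub(r"abc", "def", "xabcx", count=1),
    # s/a\s*b/abc/
    "quantifier": re.sub(r"a\s*b", "abc", "a   b", count=1),
    # s/[abc]/def/
    "character class": re.sub(r"[abc]", "def", "cab", count=1),
    # s/ab(c)/def$1/  (Python writes the backreference as \1)
    "backreference": re.sub(r"ab(c)", r"def\1", "abc", count=1),
    # s/^fred/wilma/
    "start anchor": re.sub(r"^fred", "wilma", "fred flintstone", count=1),
    # s/rubble$/and betty/
    "end anchor": re.sub(r"rubble$", "and betty", "barney rubble", count=1),
    # s/Abcd/def/gi
    "modifiers": re.sub(r"Abcd", "def", "ABCD abcd", flags=re.IGNORECASE),
}

for name, value in results.items():
    print(f"{name}: {value}")
# simple: xdefx
# quantifier: abc
# character class: defab
# backreference: defc
# start anchor: wilma flintstone
# end anchor: barney and betty
# modifiers: def def
```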

After this, I would give a few examples illustrating the pitfalls of trying to match HTML tags or other strings that shouldn't be handled with regexes, to show the limitations.


Try to come up with tests whose answers can't simply be found with Google.

An email validator, for instance, is no challenge: a ready-made one is easy to find.

Try something like a "divisible by five" test.

The input is a 5-digit number whose digits must sum to a multiple of five: 12345 → 1+2+3+4+5 = 15, and 15 / 5 = 3(.0)
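To grade answers to a puzzle like this, a reference checker is useful. The sketch below is not a pure-regex solution: the regex only checks for exactly five digits and the sum is done in ordinary code, so it can serve as the oracle that a student's regex is compared against. The name `passes_five_test` is invented for the example.

```python
import re

FIVE_DIGITS = re.compile(r"[0-9]{5}")

def passes_five_test(text):
    """True if text is exactly five digits whose sum is divisible by 5."""
    if FIVE_DIGITS.fullmatch(text) is None:
        return False
    return sum(int(ch) for ch in text) % 5 == 0

print(passes_five_test("12345"))  # True  (1+2+3+4+5 = 15)
print(passes_five_test("12346"))  # False (sum is 16)
print(passes_five_test("1234"))   # False (only four digits)
```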
