I have a string that contains the code of a webpage.
This is an example:
<input type="text" name="x4B07" value="650"
onchange="this.form.x8000.value=this.name;this.form.submit();"/>
<input type="text" name="x4B08" value="250"
onchange="this.form.x8000.value=this.name;this.form.submit();"/>
From that string I want to get the 650 and the 250 (these are variables and their values change).
How can I do so?
Example of the result I want:
name | value |
---|---|
x4b08 | 254 |
x4b07 | 253 |
x4b06 | 252 |
x4b05 | 251 |
If you are confident that the markup will never change (and you have a simple snippet like your example line), a regex could get you those values, for example:
Regex re = new Regex("name=\"(.*?)\" value=\"(.*?)\"");
Match match = re.Match(yourString);
if (match.Success && match.Groups.Count == 3) {
    // Groups[0] is the whole match; Groups[1] and Groups[2] are the two captures.
    string name = match.Groups[1].Value;
    string value = match.Groups[2].Value;
}
Alternatively, you could parse the page content with an HTML parser, query the resulting document for the elements, and then extract the values (see the question "Looking for C# HTML parser").
You can use regular expressions to match value="([0-9]*)".
Or you can look for the string "value" using string.IndexOf and then take the following few characters.
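As a minimal sketch of the regex suggestion, assuming the values are always runs of digits inside double quotes, the pattern can be applied with Regex.Matches to collect every value rather than just the first. The class and method names here (ValueExtractor, ExtractNumericValues) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ValueExtractor
{
    // Returns the digits captured from every value="..." attribute in the input.
    // An empty attribute (value="") yields an empty string, because the pattern uses *.
    public static List<string> ExtractNumericValues(string html)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(html, "value=\"([0-9]*)\""))
        {
            values.Add(m.Groups[1].Value);
        }
        return values;
    }

    public static void Main()
    {
        // Sample input modelled on the snippet in the question.
        string html =
            "<input type=\"text\" name=\"x4B07\" value=\"650\" />" +
            "<input type=\"text\" name=\"x4B08\" value=\"250\" />";

        foreach (string v in ExtractNumericValues(html))
        {
            Console.WriteLine(v);
        }
    }
}
```

Note that this does not tell you which name each value belongs to; if you need that, capture the name attribute as well.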
This should work for you, at least for the first value in the string (assuming that s contains the string you want to parse):
string value = s.Substring(s.IndexOf("value=")+7);
value = value.Substring(0, value.IndexOf("\""));
How specific are your examples? Could you also want to extract varying length alphabetic strings? Will the strings you want to extract always be properties?
While the regex/substring approaches work for the specified examples, I think they will scale quite badly.
I'd parse the HTML using a parser (see ndtreviv's answer) or possibly with an XML parser (if the HTML is valid XHTML). That way you get better control and don't have to bleed your eyes out fiddling with a bucketload of regexes.
If you have multiple such controls in the string, you can create an XmlDocument and iterate through it.
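A rough sketch of the XmlDocument idea follows. It assumes the fragment is well-formed XML (quoted attributes, self-closed tags, as in the question's snippet); real-world HTML often is not, and LoadXml will throw on it. The wrapper element name "root" and the names InputParser and ParseInputs are hypothetical, added only so the fragment has a single root:

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class InputParser
{
    // Wraps the fragment in one root element so XmlDocument accepts it,
    // then maps each <input>'s name attribute to its value attribute.
    public static Dictionary<string, string> ParseInputs(string fragment)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root>" + fragment + "</root>");

        var result = new Dictionary<string, string>();
        foreach (XmlElement input in doc.SelectNodes("//input"))
        {
            // GetAttribute returns an empty string when the attribute is missing.
            result[input.GetAttribute("name")] = input.GetAttribute("value");
        }
        return result;
    }

    public static void Main()
    {
        string fragment =
            "<input type=\"text\" name=\"x4B07\" value=\"650\" />" +
            "<input type=\"text\" name=\"x4B08\" value=\"250\" />";

        foreach (KeyValuePair<string, string> pair in ParseInputs(fragment))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Because the lookup is by element and attribute rather than by text position, this keeps working if the attribute order or spacing in the markup changes.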
I just solved it with this:
// Requires: using System.Collections; using System.IO; using System.Net; using System.Text.RegularExpressions;
// Download the page into a string.
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URL);
string buffer;
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
    buffer = sr.ReadToEnd();
}
// Collect a { name, value } pair for every match in the page.
ArrayList uniqueMatches = new ArrayList();
Regex RE = new Regex("name=\"(.*?)\" value=\"(.*?)\"", RegexOptions.Multiline);
MatchCollection theMatches = RE.Matches(buffer);
for (int counter = 0; counter < theMatches.Count; counter++)
{
    // Groups[1] is the name attribute, Groups[2] is the value attribute.
    string[] dados = new string[2];
    dados[0] = theMatches[counter].Groups[1].Value;
    dados[1] = theMatches[counter].Groups[2].Value;
    uniqueMatches.Add(dados);
}
Thanks, all, for the help.