C# - Regex problem finding/displaying lines to RichTextBoxes

Developer https://www.devze.com 2023-03-20 07:46 Source: Internet

What I am trying to do is format a file and sort it into 3 different RichTextBoxes depending on the regular expressions that matches the line in the .txt file.

The text file I am opening looks like this:

POS     INFO        XINFO    YINFO    INFO  whatIWantToMatch  
J6      INT-00113G  227.905  174.994  180   SOIC8     
J3      INT-00113G  227.905  203.244  180   SOIC8     
U13     EXCLUDES    242.210  181.294  180   QFP128    
U3      IC-00276G   236.135  198.644  90    BGA48     
U12     IC-00270G   250.610  201.594  0     SOP8      
J1      INT-00112G  269.665  179.894  180   SOIC16    
J2      INT-00112G  269.665  198.144  180   SOIC16    
C44     EXCLUDES    237.910  193.469  0     0603_5    
C45     EXCLUDES    244.102  193.387  0     0603_5    
C76     CAP-00117G  227.710  198.594  0     0603_5    
C13     EXCLUDES    245.044  191.416  90    0402_2    
R12     RES-00458G  246.560  202.694  90    0402_2  

Here is my code:

    private void GCFormatButton_Click(object sender, EventArgs e)
    {
        // Resets the text in the placement rich text boxes.
        placementOneRichTextBox.ResetText();
        placementTwoRichTextBox.ResetText();
        userDefinedRichTextBox.ResetText();

        formatHelper();
        listFormatHelper();
    }

    private void formatHelper()
    {
        try
        {
            // Reads the lines in the file to format.
            var fileReader = File.OpenText(openGCFile.FileName);

            // Creates lists for the lines to be stored in.
            var placementOneList = new List<string>();
            var placementTwoList = new List<string>();
            var placementUserDefinedList = new List<string>();

            // Reads the first line and does nothing with it.
            fileReader.ReadLine();

            // Adds each line in the file to the list.
            while (true)
            {
                var line = fileReader.ReadLine();
                if (line == null)
                    break;

                placementOneList.Add(line);
                placementTwoList.Add(line);
                placementUserDefinedList.Add(line);
            }

            // Handles all of the requirements for placement type one.
            placementOneList = findPackagePlacementOneType(placementOneList);

            // Prints the formatted refs to the richtextbox.
            foreach (var line in placementOneList)
                placementOneRichTextBox.AppendText(line + "\n");

            // Handles the requirements for placement type two.
            placementTwoList = findPackagePlacementTwoType(placementTwoList);

            // Prints the formatted refs to the richtextbox.
            foreach (var line in placementTwoList)
                placementTwoRichTextBox.AppendText(line + "\n");

            // Handles all of the requirements for the user-defined placement type.
            placementUserDefinedList = findPackagePlacementChoiceType(placementUserDefinedList);

            // Prints the formatted refs to the richtextbox.
            foreach (var line in placementUserDefinedList)
                userDefinedRichTextBox.AppendText(line + "\n");
        }

        // Catches an exception if the file was not opened.
        catch (Exception)
        {
            MessageBox.Show("Could not format the text.", "Formatting Text Error",
                MessageBoxButtons.OK, MessageBoxIcon.Warning);
        }
    }

    private void listFormatHelper()
    {
        // Splits the lines in the rich text boxes
        var listOneLines = placementOneRichTextBox.Text.Split('\n');
        var listTwoLines = placementTwoRichTextBox.Text.Split('\n');
        var listUserLines = userDefinedRichTextBox.Text.Split('\n');

        // Resets the text in the listboxes
        placementOneListBox.ResetText();
        placementTwoListBox.ResetText();
        userDefinedListBox.ResetText();

        // Set the selection mode to multiple and extended.
        placementOneListBox.SelectionMode = SelectionMode.MultiExtended;
        placementTwoListBox.SelectionMode = SelectionMode.MultiExtended;
        userDefinedListBox.SelectionMode = SelectionMode.MultiExtended;

        //placementOneListBox.Data
        // Shutdown the painting of the ListBox as items are added.
        placementOneListBox.BeginUpdate();
        placementTwoListBox.BeginUpdate();
        userDefinedListBox.BeginUpdate();

        // Display the items in the listbox.
        placementOneListBox.DataSource = listOneLines;
        placementTwoListBox.DataSource = listTwoLines;
        userDefinedListBox.DataSource = listUserLines;

        // Allow the ListBox to repaint and display the new items.
        placementOneListBox.EndUpdate();
        placementTwoListBox.EndUpdate();
        userDefinedListBox.EndUpdate();
    }

    static List<string> findPackagePlacementOneType(List<string> list)
    {
        // Creates a new list to return with new format.
        var result = new List<string>();

        // Checks each line in the list.
        foreach (var line in list)
        {
            // PLACEMENT ONE Regex
            Match regexRES = Regex.Match(line, @"RES.*");
            Match regex0402 = Regex.Match(line, @"0603.*");
            Match regex0201 = Regex.Match(line, @"0201.*");
            Match regex0603 = Regex.Match(line, @"0603.*");
            Match regex0805 = Regex.Match(line, @"0805.*");
            Match regex1206 = Regex.Match(line, @"1206.*");
            Match regex1306 = Regex.Match(line, @"1306.*");
            Match regex1608 = Regex.Match(line, @"1608.*");
            Match regex3216 = Regex.Match(line, @"3216.*");
            Match regex2551 = Regex.Match(line, @"2551.*");
            Match regex1913 = Regex.Match(line, @"1913.*");
            Match regex1313 = Regex.Match(line, @"1313.*");
            Match regex2513 = Regex.Match(line, @"2513.*");
            Match regex5125 = Regex.Match(line, @"5125.*");
            Match regex2525 = Regex.Match(line, @"2525.*");
            Match regex5619 = Regex.Match(line, @"5619.*");
            Match regex3813 = Regex.Match(line, @"3813.*");
            Match regex1508 = Regex.Match(line, @"1508.*");
            Match regex6431 = Regex.Match(line, @"6431.*");
            Match regex2512 = Regex.Match(line, @"2512.*");
            Match regex1505 = Regex.Match(line, @"1505.*");
            Match regex2208 = Regex.Match(line, @"2208.*");
            Match regex1005 = Regex.Match(line, @"1005.*");
            Match regex1010 = Regex.Match(line, @"1010.*");
            Match regex2010 = Regex.Match(line, @"2010.*");
            Match regex0505 = Regex.Match(line, @"0505.*");
            Match regex0705 = Regex.Match(line, @"0705.*");
            Match regex1020 = Regex.Match(line, @"1020.*");
            Match regex1812 = Regex.Match(line, @"1812.*");
            Match regex2225 = Regex.Match(line, @"2225.*");
            Match regex5764 = Regex.Match(line, @"5764.*");
            Match regex4532 = Regex.Match(line, @"4532.*");
            Match regex1210 = Regex.Match(line, @"1210.*");
            Match regex0816 = Regex.Match(line, @"0816.*");
            Match regex0363 = Regex.Match(line, @"0363.*");
            Match regexSOT = Regex.Match(line, @"SOT.*");

            if (regexRES.Success || regex0402.Success || regex0201.Success || regex0603.Success ||
                regex0805.Success || regex1206.Success || regex1306.Success || regex1608.Success ||
                regex3216.Success || regex2551.Success || regex1913.Success || regex1313.Success ||
                regex2513.Success || regex5125.Success || regex2525.Success || regex5619.Success ||
                regex3813.Success || regex1508.Success || regex6431.Success || regex2512.Success ||
                regex1505.Success || regex2208.Success || regex1005.Success || regex1010.Success ||
                regex2010.Success || regex0505.Success || regex0705.Success || regex1020.Success ||
                regex1812.Success || regex2225.Success || regex5764.Success || regex4532.Success ||
                regex1210.Success || regex0816.Success || regex0363.Success || regexSOT.Success)
            {
                result.Add(string.Join(" ", line));
            }

            else
                result.Remove(line);
        }

        // Returns the new list so it can be formatted further.
        return result;
    }

    static List<string> findPackagePlacementTwoType(List<string> list)
    {
        // Creates a new list to return with new format.
        var result = new List<string>();

        // Checks each line in the list.
        foreach (var line in list)
        {
            // PLACEMENT TWO Regex
            Match regexBGA = Regex.Match(line, @"BGA.*");
            Match regexSOP8 = Regex.Match(line, @"SOP8.*");
            Match regexQSOP = Regex.Match(line, @"QSOP.*");
            Match regexTQSOP = Regex.Match(line, @"TQSOP.*");
            Match regexSOIC16 = Regex.Match(line, @"SOIC16.*");
            Match regexSOIC12Plus = Regex.Match(line, @"SOIC12.*");
            Match regexSOIC8 = Regex.Match(line, @"SOIC8.*");
            Match regexSO8 = Regex.Match(line, @"SO8.*");
            Match regexSO08 = Regex.Match(line, @"SO08.*");
            Match regexCQFP = Regex.Match(line, @"CQFP.*");
            Match regexLCC = Regex.Match(line, @"LCC.*");
            Match regexLGA = Regex.Match(line, @"LGA.*");
            Match regexOSCCC = Regex.Match(line, @"OSCCC.*");
            Match regexPLCC = Regex.Match(line, @"PLCC.*");
            Match regexQFN = Regex.Match(line, @"QFN.*");
            Match regexQFP = Regex.Match(line, @"QFP.*");
            Match regexSOJ = Regex.Match(line, @"SOJ.*");
            Match regexSON = Regex.Match(line, @"SON.*");

            if (regexBGA.Success || regexSOP8.Success || regexQSOP.Success || regexTQSOP.Success ||
               regexSOIC16.Success || regexSOIC12Plus.Success || regexSOIC8.Success || regexSO8.Success ||
               regexSO08.Success || regexCQFP.Success || regexLCC.Success || regexLGA.Success ||
               regexOSCCC.Success || regexPLCC.Success || regexQFN.Success || regexQFP.Success ||
               regexSOJ.Success || regexSON.Success)
            {
                result.Add(string.Join(" ", line));
            }

            else
                result.Remove(line);
        }

        // Returns the new list so it can be formatted further.
        return result;
    }

    static List<string> findPackagePlacementChoiceType(List<string> list)
    {
        // Creates a new list to return with new format.
        var result = new List<string>();

        // Checks each line in the list.
        foreach (var line in list)
        {
            // USER-DEFINED PLACEMENT Regex
            Match regexCAP = Regex.Match(line, @"CAP.*");
            Match regexIND = Regex.Match(line, @"IND.*");
            Match regexMELF = Regex.Match(line, @"MELF.*");
            Match regexDIOM = Regex.Match(line, @"DIOM.*");
            Match regexSOD = Regex.Match(line, @"SOD.*");
            Match regexSTO = Regex.Match(line, @"STO.*");
            Match regexTO = Regex.Match(line, @"TO.*");

            if (regexCAP.Success || regexIND.Success || regexMELF.Success || regexDIOM.Success ||
               regexSOD.Success || regexSTO.Success || regexTO.Success)
            {
                result.Add(string.Join(" ", line));
            }

            else
                result.Remove(line);
        }

        // Returns the new list so it can be formatted further.
        return result;
    }

However, the regexes I have been using do not match what I want. I would like to match only the last column of each line, labeled "whatIWantToMatch" in the text file above. Also, for some reason the function "findPackagePlacementChoiceType" returns some of the same lines as "findPackagePlacementOneType", and it should not.


QUESTIONS

  • Any suggestions on how I can write better regexes?
  • Why is findPackagePlacementChoiceType matching some of the same lines as findPackagePlacementOneType?
  • Why does findPackagePlacementOneType (and the other functions) not properly grab its matches?
    • What I mean is, it may grab 2 of the 3 "0603_5" endings instead of all 3...?


Since it doesn't look like there is much in common between the "whatIWantToMatch" values that you want to put in the same group, and you also know ahead of time what all possible values will be, you may want to consider using a simple if/else construct rather than regular expressions:

var placementOneList = new List<string>();
var placementTwoList = new List<string>();
var placementUserDefinedList = new List<string>();

// For each line in the file, skipping the header row (Skip needs "using System.Linq;")
foreach (string line in File.ReadAllLines("filename").Skip(1))
{
    // Split the line to get only the "whatIWantToMatch" token
    // (Error handling omitted for simplicity)
    var match = line.Split(new String[] {" ", "\t"},
        StringSplitOptions.RemoveEmptyEntries)[5];

    // Put the line in the appropriate list depending upon its "whatIWantToMatch" value.
    // StartsWith compares literally, so the prefix needs no regex syntax such as ".".
    if (match.StartsWith("RES")) { placementOneList.Add(line); }
    else if (match.StartsWith("0603")) { placementOneList.Add(line); }
    // ...
    else if (match.StartsWith("BGA")) { placementTwoList.Add(line); }
    // ...
    else { throw new ApplicationException(); } // No match found
}
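
If the if/else chain gets long, the same idea could be made table-driven. The following is only a sketch under some assumptions: the class name PlacementSorter, the method Classify, and the abbreviated prefix arrays are all made up for illustration, and the real lists would come from the question's code.

```csharp
using System;
using System.Linq;

public static class PlacementSorter
{
    // Hypothetical, abbreviated prefix tables; the full lists from the question would go here.
    static readonly string[] PlacementOnePrefixes = { "0201", "0402", "0603", "0805", "SOT" };
    static readonly string[] PlacementTwoPrefixes = { "BGA", "SOP8", "SOIC", "QFP" };

    // Returns 1 or 2 when the last column starts with a known prefix, otherwise 0.
    public static int Classify(string line)
    {
        // Split on runs of spaces or tabs and drop the empty pieces.
        var tokens = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length < 6) return 0; // not a data row

        string package = tokens[5];
        if (PlacementOnePrefixes.Any(p => package.StartsWith(p, StringComparison.Ordinal))) return 1;
        if (PlacementTwoPrefixes.Any(p => package.StartsWith(p, StringComparison.Ordinal))) return 2;
        return 0;
    }
}
```

Returning 0 instead of throwing lets the caller decide what to do with unknown rows (for example, send them to the user-defined list); that is a design choice of this sketch, not something the original answer prescribes.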


This line looks wrong:

Match regex0402 = Regex.Match(line, @"0603.*");

Shouldn't it be:

Match regex0402 = Regex.Match(line, @"0402.*");
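
One way to make this kind of copy-paste slip harder is to keep the codes in a single list and build one alternation from it, so no variable name can drift away from its pattern. This is a rough sketch, not a drop-in replacement: PackagePatterns and IsPlacementOnePackage are hypothetical names, the list is abbreviated, and the method assumes you pass it the last column only (not the whole line).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PackagePatterns
{
    // Abbreviated list for illustration; the question's full list would go here.
    static readonly string[] PlacementOneCodes = { "0201", "0402", "0603", "0805", "1206" };

    // Builds ^(?:0201|0402|0603|0805|1206) from the list above.
    static readonly Regex PlacementOne = new Regex(
        "^(?:" + string.Join("|", PlacementOneCodes.Select(c => Regex.Escape(c))) + ")");

    // True when the package token (the last column only) starts with one of the codes.
    public static bool IsPlacementOnePackage(string package)
    {
        return PlacementOne.IsMatch(package);
    }
}
```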


Because you are simply OR-ing the results, and each pattern is searched for anywhere in the line, any line that matches ANY of those regexes in ANY column will be returned. You need to write a regex that matches the line structure itself.

For instance:

J3 INT-00113G 227.905 203.244 180 SOIC8

Might be matched by something like

^(\w+\d)\s+(\w+-\d{5}\w)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\w{4}\d)

But without knowing how this data varies, it is difficult to know what might change from line to line. Check out the MSDN article on regular expressions, and construct one that matches each case of line variation.
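
To illustrate matching the line structure, here is a looser variant with named groups that only assumes six whitespace-separated columns, with two decimal coordinates and an integer in the middle. The group names (pos, part, x, y, rot, package) and the PlacementLine class are labels I made up; they are not taken from any file specification.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlacementLine
{
    // Six columns: reference, part number, X, Y, integer, package.
    static readonly Regex Row = new Regex(
        @"^\s*(?<pos>\S+)\s+(?<part>\S+)\s+(?<x>\d+\.\d+)\s+(?<y>\d+\.\d+)\s+(?<rot>\d+)\s+(?<package>\S+)\s*$");

    // Extracts the part and package columns; returns false for lines of any other shape.
    public static bool TryParse(string line, out string part, out string package)
    {
        Match m = Row.Match(line);
        part = m.Success ? m.Groups["part"].Value : "";
        package = m.Success ? m.Groups["package"].Value : "";
        return m.Success;
    }
}
```

Run against the C76 row from the question, this yields CAP-00117G as the part and 0603_5 as the package. That is why the row shows up in two boxes when the whole line is searched: "0603" is found in the last column and "CAP" in the second one. Testing only the package group would avoid the overlap.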

EDIT

Ok so after closer examination of your original questions, you want to match a specific string at the end of each line:

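^.+(SOIC8)\s*$

Matches a line ending in SOIC8 (the \s*$ anchors the match to the end of the line while allowing for the trailing spaces in your file)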

Why does the findPackagePlacementOneType (and the other functions) not properly grab their match?

I just noticed that some of your regex strings end in .* (the . matches any character and the * repeats it zero or more times, so the suffix adds nothing to whether the line matches). Use \. for a literal period, and be specific if the value always ends in a number: \d
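
A small sketch of the difference between an unanchored search and one tied to the last column. AnchoringDemo and its method names are hypothetical, and the anchored pattern assumes the value you care about is always the final whitespace-separated token on the line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoringDemo
{
    // Unanchored, as in the question: succeeds if "CAP" occurs in any column.
    public static bool ContainsCap(string line)
    {
        return Regex.IsMatch(line, @"CAP.*");
    }

    // Anchored: the prefix must begin the last token, followed only by trailing whitespace.
    public static bool LastColumnStartsWith(string line, string prefix)
    {
        return Regex.IsMatch(line, @"(?:^|\s)" + Regex.Escape(prefix) + @"\S*\s*$");
    }
}
```

For the C76 row (CAP-00117G in the second column, 0603_5 in the last), ContainsCap succeeds, while LastColumnStartsWith succeeds for "0603" but not for "CAP".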
