Finding words in Treetop: some matches not being made

Developer https://www.devze.com 2023-03-29 17:44 Source: Internet
I've run into a bit of a strange situation.

I'm trying to parse measurements using Treetop.

For instance: 6' of 1/2" Copper Pipe. Of course, the units can also be written as feet, Feet, inch, inches, Inch, etc.

So I have a rule:

rule measurement
      ('\'' / 'Foot' / 'foot' / 'Feet' / 'feet' / 
       '"' / 'Inches' / 'inches' /  'Inch' / 'inch' /
       'cm' / 'cms' / 'Centimeters' / 'centimeters' /  'Centimeter' / 'centimeter' / 
       'm' / 'ms' / 'Meters' / 'meters'/ 'Meter' / 'meter' / 
       'lb' / 'lbs' /  'Pounds' / 'pounds' / 'Pound' / 'pound' )
       (s? ')' / s) {
                    def value
                          [:measurement, text_value]
                    end
                    }
end

rule space
    [\s]+
end

When I enter '6 inches', '6 pounds', '6 Meters', everything works great, and I get my number and measurement returned.

When I enter '6 meters', meters isn't parsed properly.

Most of the measurements work fine; only 'meters' and 'pound' are being missed among the measurements I've provided here (but I'm sure I'll be adding more measurements in the future).

Any ideas as to why I would be experiencing this?

As requested, here is a more 'pared down' version of the full grammar:

grammar FullMeasurements
       rule full_product
           measures s? alternate_measure product_name {
             def value
                  [:full_product, text_value]
             end
           }

       end

       rule measures
        single_measure / dual_measure / quantity {
            def measures
                [:measures, text_value] unless text_value.blank?
            end
        }
    end


    rule dual_measure
        quantity s? single_measure {
            def value
                [:dual_measure, text_value] unless text_value.blank?
            end

            }
    end


    rule alternate_measure 
        '(' s? single_measure {
            def value
                [:alternate_measure, text_value] unless text_value.blank?
            end
        }
    end

    rule single_measure 
        (range_number / number) s? measurement optional_secondary_measurements  {
            def value
                [:single_measure, text_value]
            end
        }
    end

    rule optional_secondary_measurements
        measurement? {
            def value
                [:optional_secondary_measurements, text_value]
            end
        }
    end



    rule quantity
        (range_number / number) s? divisor? {
            def value
                [:quantity, text_value]
            end
        }
    end

        rule measurement
              ('\'' / 'Foot' / 'foot' / 'Feet' / 'feet' / 
               '"' / 'Inches' / 'inches' /  'Inch' / 'inch' /
               'cm' / 'cms' / 'Centimeters' / 'centimeters' /  'Centimeter' / 'centimeter' / 
               'm' / 'ms' / 'Meters' / 'meters'/ 'Meter' / 'meter' / 
               'lb' / 'lbs' /  'Pounds' / 'pounds' / 'Pound' / 'pound' )
                (s? ')' / s) {
                    def value
                          [:measurement, text_value]
                    end
                    }
         end



        rule divisor
        "x" 
    end

    rule product_name
            !measures words+ {
            def value
                [:product_name, text_value]
            end
        }
    end


    rule number 
     frac_number / regular_number optional_frac {
            def value
                [:number, text_value]
            end
        }
        end



        rule optional_frac
        frac_number? {
            def value
                [:optional_frac, text_value]
            end
        }
         end



         rule frac_number
        (s? regular_number '/' regular_number)  {
            def value
                [:frac_number, text_value]
            end
        }
        end

        rule words
        [0-9a-zA-Z\-()&.%'*\s]+ {
            def value
                text_value
            end 
        }

          end

        rule regular_number
        [0-9\.]+ {
            def value
                text_value
            end 
        }

        end

        rule space
          [\s]+
         end
end


Since PEGs are greedy and / is an ordered choice, your measurement rule matches the literal text "meter", and the parse then fails because no following rule matches the leftover "s". Unlike regular expressions, PEGs will not backtrack into a previous successful match when a later one fails.

Switch the order of items in your rule to have the plurals first, and you should be good to go.
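
To make the commit-on-first-match behaviour concrete, here is a minimal sketch in plain Ruby. It is not Treetop and does not use its API; `ordered_choice` and `match_measurement` are hypothetical helpers that only model a sequence of the form (unit alternatives) followed by (s? ')' / s), assuming `s` means one or more whitespace characters. It uses the 'lb' / 'lbs' pair from the posted rule.

```ruby
# Plain-Ruby model of a PEG sequence; illustrative only, not Treetop's API.

# Ordered choice: return the FIRST literal that matches at pos, or nil.
# Later alternatives are never tried once an earlier one has matched.
def ordered_choice(literals, input, pos)
  literals.find { |lit| input[pos, lit.length] == lit }
end

# Models: (lit1 / lit2 / ...) (s? ')' / s)
# Assumption: `s` stands for one or more whitespace characters.
# Returns the matched text, or nil when the sequence fails.
def match_measurement(literals, input)
  unit = ordered_choice(literals, input, 0)
  return nil unless unit

  rest = input[unit.length..-1]
  terminator = rest[/\A(?:\s*\)|\s+)/]
  # No retry with another unit alternative here: the choice is already committed.
  terminator ? unit + terminator : nil
end

PREFIX_FIRST  = %w[lb lbs]   # order used in the posted rule
LONGEST_FIRST = %w[lbs lb]   # plural (longer literal) first

p match_measurement(PREFIX_FIRST, 'lbs ')   # => nil ('lb' wins, then "s " is not a terminator)
p match_measurement(LONGEST_FIRST, 'lbs ')  # => "lbs "
p match_measurement(LONGEST_FIRST, 'lb ')   # => "lb "
```

With the longer literal listed first, both the plural and the singular still match, which is why reordering alone is enough.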


Phrogz was on the right track, but it is not "meter" that gets matched first. It is 'm', which comes earlier in the rule and leaves nothing to match the leftover "eter" or "eters".
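
One way to avoid this class of problem as more units are added is to order the literals longest-first before writing them into the rule. The sketch below is plain Ruby under that assumption; `longest_first`, `shadowed_pairs`, and `treetop_alternation` are hypothetical helper names, and the unit list is a subset of the one in the question (the ' and " literals are left out because they would need escaping).

```ruby
# Hypothetical helpers for ordering unit literals; not part of Treetop.
UNITS = %w[m ms Meters meters Meter meter lb lbs Pounds pounds Pound pound cm cms]

# Longest literal first, so no literal is preceded by one of its own prefixes.
def longest_first(literals)
  literals.sort_by { |lit| [-lit.length, lit] }
end

# Pairs [earlier, later] where `earlier` is a proper prefix of `later`
# and would therefore win the ordered choice before `later` is tried.
def shadowed_pairs(literals)
  literals.each_with_index.flat_map do |earlier, i|
    literals[(i + 1)..-1]
      .select { |later| later != earlier && later.start_with?(earlier) }
      .map { |later| [earlier, later] }
  end
end

# Render the literals as a Treetop-style ordered choice.
def treetop_alternation(literals)
  literals.map { |lit| "'#{lit}'" }.join(' / ')
end

p shadowed_pairs(UNITS)     # includes ["m", "meters"], ["lb", "lbs"], ["cm", "cms"]
ordered = longest_first(UNITS)
p shadowed_pairs(ordered)   # => []
puts treetop_alternation(ordered)
```

The printed alternation can be pasted into the measurement rule in place of the hand-ordered list. Sorting by length is stricter than necessary (only prefix pairs matter), but it is simple to keep correct as the list grows.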
