How to implement Benford's law in MATLAB

开发者 https://www.devze.com 2022-12-26 08:35 Source: web
I want to implement a version of Benford's law (http://en.wikipedia.org/wiki/Benford%27s_law), which basically asks for the first digit of a number so that I can analyze the distribution of those digits.

1934 ---> 1
0.04 ---> 4
-56 ---> 5

How do you do this in MATLAB?


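function res = first_digit(number)
    % Return the leading nonzero digit of each element of number.
    number = abs(number);   % the sign does not affect the first digit
    % Scale into [1, 10) by the decimal exponent, then drop the fraction.
    res = floor(number ./ (10 .^ floor(log10(number))));
end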

It works for nonzero real numbers in general (see gnovice's comment for an extreme case). An input of 0 returns NaN, because log10(0) is -Inf and the division becomes 0/0.
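
A minimal sketch of guarding that edge case, assuming it is acceptable to report NaN for inputs that have no leading digit (0, Inf, NaN). The variable names and the NaN convention are illustrative assumptions, not part of the answer above.

```matlab
x = [1934 250 -56 0];               % sample inputs; 0 has no leading digit
a = abs(x);                         % the sign does not affect the first digit
firstDigit = nan(size(a));          % assumed convention: NaN means "no first digit"
valid = a > 0 & isfinite(a);        % skip 0, Inf and NaN
% Scale each valid value into [1, 10) and drop the fractional part.
firstDigit(valid) = floor(a(valid) ./ 10 .^ floor(log10(a(valid))));
disp(firstDigit)
```

The logical mask keeps the computation vectorized while leaving the invalid entries untouched.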


A few ways you can do this...

  • Using REGEXP:

    wholeNumber = 1934;                      %# Your number
    numberString = num2str(wholeNumber,16);  %# Convert to a string
    matches = regexp(numberString,'[1-9]','match');  %# Find matches
    firstNumber = str2double(matches{1});  %# Convert the first match to a double
    
  • Using ISMEMBER:

    wholeNumber = 0.04;                      %# Your number
    numberString = num2str(wholeNumber,16);  %# Convert to a string
    isInSet = ismember(numberString,'123456789');  %# Find numbers that are
                                                   %#  between 1 and 9
    numberIndex = find(isInSet,1);           %# Get the first number index
    firstNumber = str2double(numberString(numberIndex));  %# Convert to a double
    

EDIT:

Some discussion of this topic has arisen on one of the MathWorks blogs. Some interesting additional solutions are provided there. One issue that was brought up was having vectorized solutions, so here's one vectorized version I came up with:

numberVector = [1934 0.04 -56];
numberStrings = cellstr(num2str(numberVector(:),16));  %# One string per number
firstIndices = regexp(numberStrings,'[1-9]','once');   %# Index of the first nonzero digit
firstNumbers = cellfun(@(s,i) s(i),numberStrings,firstIndices);  %# Digits, returned as characters


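Using the built-in log10 and floor functions,

floor(abs(x)./10.^floor(log10(abs(x))))

returns the first digit of every element of an array as well. The abs call is needed for negative inputs, because log10 of a negative number is complex.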
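
Since the goal is to analyze the distribution, here is a rough sketch of how a log10/floor expression like the one above could feed a comparison against Benford's law, which predicts a frequency of log10(1 + 1/d) for leading digit d. The sample data is made up for illustration, and zeros are dropped on the assumption that they should not be counted.

```matlab
x = [1934 -56 12 187 1023 305 17 1.5 960 42];   % made-up sample data
a = abs(x(x ~= 0));                              % drop zeros, ignore signs
digits = floor(a ./ 10 .^ floor(log10(a)));      % leading digit of each value

% Observed share of each leading digit 1..9.
observed = accumarray(digits(:), 1, [9 1]) / numel(digits);

% Share predicted by Benford's law for digits 1..9.
expected = log10(1 + 1 ./ (1:9)');

% Columns: digit, observed share, expected share.
disp([(1:9)' observed expected])
```

With only ten values the observed shares are far too coarse to say anything about the data; a real analysis would need a much larger sample.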


Let me add another string-based solution (vectorized as well):

FirstDigit = @(n) sscanf(num2str(abs(n(:)),'%e'), '%1d', numel(n));

and tested on the cases mentioned here:

>> FirstDigit( [1934 0.04 -56 eps(realmin)] )
ans =
     1
     4
     5
     4
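
One caveat worth keeping in mind: '%e' prints six decimal places, so a value such as 9.9999999 is rounded to 1.000000e+01 before the digit is read. The arithmetic approach has the opposite problem and can come out one low for inputs such as 0.3, since 0.3/0.1 evaluates to slightly less than 3 in double precision. Below is a hedged per-element sketch that prints 16 significant digits instead. The name firstDigitOf and the '%.15e' precision are assumptions chosen for illustration, and inputs are assumed to be finite.

```matlab
% Hypothetical helper: print with 16 significant digits in exponent form,
% then read back the single digit in front of the decimal point.
firstDigitOf = @(v) sscanf(sprintf('%.15e', abs(v)), '%1d', 1);

n = [1934 0.04 -56 0.3];
firstDigits = arrayfun(firstDigitOf, n);   % one call per element
disp(firstDigits)
```

This trades the single vectorized call for an arrayfun loop, which is slower on large arrays but keeps each conversion easy to follow.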