I'm trying to make a program using NASM that takes input from command-line arguments. Since the string length is not provided, I'm trying to write a function to compute it myself. Here is my attempt, which takes a pointer to a string in the ebx register and returns the length of the string in ecx:
len:
push ebx
mov ecx,0
dec ebx
count:
inc ecx
inc ebx
cmp ebx,0
jnz count
dec ecx
pop ebx
ret
My method is to go through the string, character by character, and check whether each one is null. If it's not, I increment ecx and go to the next character. I believe the problem is that cmp ebx,0 is incorrect for what I'm trying to do. How would I properly check whether the character is null? Also, are there other things that I could be doing better?
You are comparing the value in ebx with 0, which is not what you want. The value in ebx is the address of a character in memory, so it should be dereferenced like this:
cmp byte[ebx], 0
Also, the last push ebx should be pop ebx.
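Putting both points together, a corrected version of the routine from the question might look like the sketch below. It keeps the original structure and register convention (pointer in ebx, length returned in ecx) and changes only the comparison. Treat it as a sketch, not the only way to write it.

```nasm
; len: ebx = pointer to a NUL-terminated string
; returns ecx = length in bytes; ebx is preserved
len:
        push    ebx             ; save the caller's pointer
        mov     ecx, 0          ; bytes examined so far
        dec     ebx             ; compensate for the first inc below
count:
        inc     ecx
        inc     ebx
        cmp     byte [ebx], 0   ; compare the byte AT ebx, not the address in ebx
        jnz     count
        dec     ecx             ; the terminator was counted too, so remove it
        pop     ebx             ; restore the caller's pointer
        ret
```

For an empty string the loop body runs once (ecx becomes 1, the byte is 0), and the final dec brings ecx back to 0.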
Here is how I do it in a 64-bit Linux executable that checks argv[1]. The kernel starts a new process with argc and argv[] on the stack, as documented in the x86-64 System V ABI.
_start:
pop rsi ; number of arguments (argc)
pop rsi ; argv[0] the command itself (or program name)
pop rsi ; rsi = argv[1], a pointer to a string
mov ecx, 0 ; counter
.repeat:
lodsb ; byte in AL
test al,al ; check if zero
jz .done ; if zero then we're done
inc ecx ; increment counter
jmp .repeat ; repeat until zero
.done:
; string is unchanged, ecx contains the length of the string
; unused, we look at command line args instead
section .rodata
asciiz: db "This is a string with 36 characters.", 0
This is slow and inefficient, but easy to understand.
For efficiency, you'd want to:
- use only 1 branch in the loop (Why are loops always compiled into "do...while" style (tail jump)?)
- avoid a false dependency by loading with movzx instead of merging into the previous RAX value (Why doesn't GCC use partial registers?)
- subtract pointers after the loop instead of incrementing a counter inside it.
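The three points above can be combined into a byte-at-a-time loop like the sketch below. The label strlen_simple and the choice of rdi for the input and rax for the result are assumptions made for illustration, not something the answer specifies.

```nasm
; strlen_simple (hypothetical name): rdi = pointer to a NUL-terminated string
; returns rax = length in bytes; rdi is left unchanged
strlen_simple:
        mov     rax, rdi            ; rax walks the string, rdi keeps the start
.loop:
        movzx   edx, byte [rax]     ; zero-extending load overwrites all of rdx,
                                    ; so there is no dependency on its old value
        inc     rax                 ; advance to the next byte
        test    edx, edx            ; was the byte we just loaded zero?
        jnz     .loop               ; the only branch, at the bottom of the loop
        sub     rax, rdi            ; rax is one past the terminator here...
        dec     rax                 ; ...so drop the terminator from the count
        ret
```

Because the pointer is incremented before the test, it ends up one byte past the terminator, which is why the result is decremented after the subtraction.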
And of course SSE2 is always available in x86-64, so we should use that to check in chunks of 16 bytes (after reaching an alignment boundary). See optimized hand-written strlen implementations like the one in glibc (https://code.woboq.org/userspace/glibc/sysdeps/x86_64/strlen.S.html).
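As a rough illustration of the 16-byte idea, here is a minimal sketch. It is not the glibc implementation: it is far simpler and unoptimized. The label strlen_sse2 and the rdi/rax register choice are assumptions. It rounds the pointer down to a 16-byte boundary so that every load is aligned and therefore cannot cross into an unmapped page.

```nasm
; strlen_sse2 (hypothetical name): rdi = pointer to a NUL-terminated string
; returns rax = length in bytes
strlen_sse2:
        pxor     xmm0, xmm0         ; 16 zero bytes to compare against
        mov      rax, rdi
        and      rax, -16           ; round down to a 16-byte boundary
        movdqa   xmm1, [rax]        ; aligned load, may include bytes before the string
        pcmpeqb  xmm1, xmm0         ; 0xFF in every byte position that held 0
        pmovmskb edx, xmm1          ; one mask bit per byte
        mov      ecx, edi
        and      ecx, 15            ; number of bytes before the string in this chunk
        shr      edx, cl            ; discard matches that precede the string
        test     edx, edx           ; shr leaves flags untouched when cl = 0
        jnz      .in_first_chunk
.loop:
        add      rax, 16            ; next aligned chunk
        movdqa   xmm1, [rax]
        pcmpeqb  xmm1, xmm0
        pmovmskb edx, xmm1
        test     edx, edx
        jz       .loop              ; no zero byte in this chunk, keep going
        bsf      edx, edx           ; index of the first zero byte in the chunk
        add      rax, rdx           ; address of the terminator
        sub      rax, rdi           ; minus the start gives the length
        ret
.in_first_chunk:
        bsf      eax, edx           ; after the shift, bit i is the byte at rdi+i
        ret
```

The explicit test after shr matters: when the string is already 16-byte aligned the shift count is 0, and a shift by 0 does not update the flags.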
Here is how I would have coded it:
len:
push ebx
mov eax, ebx
lp:
cmp byte [eax], 0
jz lpend
inc eax
jmp lp
lpend:
sub eax, ebx
pop ebx
ret
(The result is in eax). Likely there are better ways.