开发者

Unique file identifier in windows

开发者 https://www.devze.com 2022-12-13 08:15 出处:网络
Is there are way to uniquely identify a file (and possibly directories) for the lifetime of the file regardless of moves, renames and content modifications?(Windows 2000 and later). 开发者_运维百科 Ma

Is there are way to uniquely identify a file (and possibly directories) for the lifetime of the file regardless of moves, renames and content modifications? (Windows 2000 and later). 开发者_运维百科 Making a copy of a file should give the copy it's own unique identifier.

My application associates various meta-data with individual files. If files are modified, renamed or moved it would be useful to be able to automatically detect and update file associations.

FileSystemWatcher can provide events that inform of these sorts of changes, however it uses a memory buffer that can be easily filled (and events lost) if many file system events occur quickly.

A hash is no use because the content of the file can change, and so the hash will change.

I had thought of using the file creation date, however there are a few situations where this will not be unique (ie. when multiple files are copied).

I've also heard of a file SID (security ID?) in NTFS, but I'm not sure if this would do what I'm looking for.

Any ideas?


Here's sample code that returns a unique File Index.

ApproachA() is what I came up with after a bit of research. ApproachB() is thanks to information in the links provided by Mattias and Rubens. Given a specific file, both approaches return the same file index (during my basic testing).

Some caveats from MSDN:

Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time.

In the FAT file system, the file ID is generated from the first cluster of the containing directory and the byte offset within the directory of the entry for the file. Some defragmentation products change this byte offset. (Windows in-box defragmentation does not.) Thus, a FAT file ID can change over time. Renaming a file in the FAT file system can also change the file ID, but only if the new file name is longer than the old one.

In the NTFS file system, a file keeps the same file ID until it is deleted. You can replace one file with another file without changing the file ID by using the ReplaceFile function. However, the file ID of the replacement file, not the replaced file, is retained as the file ID of the resulting file.

The first bolded comment above worries me. It's not clear if this statement applies to FAT only, it seems to contradict the second bolded text. I guess further testing is the only way to be sure.

[Update: in my testing the file index/id changes when a file is moved from one internal NTFS hard drive to another internal NTFS hard drive.]

    public class WinAPI
    {
        [DllImport("ntdll.dll", SetLastError = true)]
        public static extern IntPtr NtQueryInformationFile(IntPtr fileHandle, ref IO_STATUS_BLOCK IoStatusBlock, IntPtr pInfoBlock, uint length, FILE_INFORMATION_CLASS fileInformation);

        public struct IO_STATUS_BLOCK
        {
            uint status;
            ulong information;
        }
        public struct _FILE_INTERNAL_INFORMATION {
          public ulong  IndexNumber;
        } 

        // Abbreviated, there are more values than shown
        public enum FILE_INFORMATION_CLASS
        {
            FileDirectoryInformation = 1,     // 1
            FileFullDirectoryInformation,     // 2
            FileBothDirectoryInformation,     // 3
            FileBasicInformation,         // 4
            FileStandardInformation,      // 5
            FileInternalInformation      // 6
        }

        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern bool GetFileInformationByHandle(IntPtr hFile,out BY_HANDLE_FILE_INFORMATION lpFileInformation);

        public struct BY_HANDLE_FILE_INFORMATION
        {
            public uint FileAttributes;
            public FILETIME CreationTime;
            public FILETIME LastAccessTime;
            public FILETIME LastWriteTime;
            public uint VolumeSerialNumber;
            public uint FileSizeHigh;
            public uint FileSizeLow;
            public uint NumberOfLinks;
            public uint FileIndexHigh;
            public uint FileIndexLow;
        }
  }

  public class Test
  {
       public ulong ApproachA()
       {
                WinAPI.IO_STATUS_BLOCK iostatus=new WinAPI.IO_STATUS_BLOCK();

                WinAPI._FILE_INTERNAL_INFORMATION objectIDInfo = new WinAPI._FILE_INTERNAL_INFORMATION();

                int structSize = Marshal.SizeOf(objectIDInfo);

                FileInfo fi=new FileInfo(@"C:\Temp\testfile.txt");
                FileStream fs=fi.Open(FileMode.Open,FileAccess.Read,FileShare.ReadWrite);

                IntPtr res=WinAPI.NtQueryInformationFile(fs.Handle, ref iostatus, memPtr, (uint)structSize, WinAPI.FILE_INFORMATION_CLASS.FileInternalInformation);

                objectIDInfo = (WinAPI._FILE_INTERNAL_INFORMATION)Marshal.PtrToStructure(memPtr, typeof(WinAPI._FILE_INTERNAL_INFORMATION));

                fs.Close();

                Marshal.FreeHGlobal(memPtr);   

                return objectIDInfo.IndexNumber;

       }

       public ulong ApproachB()
       {
               WinAPI.BY_HANDLE_FILE_INFORMATION objectFileInfo=new WinAPI.BY_HANDLE_FILE_INFORMATION();

                FileInfo fi=new FileInfo(@"C:\Temp\testfile.txt");
                FileStream fs=fi.Open(FileMode.Open,FileAccess.Read,FileShare.ReadWrite);

                WinAPI.GetFileInformationByHandle(fs.Handle, out objectFileInfo);

                fs.Close();

                ulong fileIndex = ((ulong)objectFileInfo.FileIndexHigh << 32) + (ulong)objectFileInfo.FileIndexLow;

                return fileIndex;   
       }
  }


If you call GetFileInformationByHandle, you'll get a file ID in BY_HANDLE_FILE_INFORMATION.nFileIndexHigh/Low. This index is unique within a volume, and stays the same even if you move the file (within the volume) or rename it.

If you can assume that NTFS is used, you may also want to consider using Alternate Data Streams to store the metadata.


Please take a look here: Unique file ids for Windows. This is also useful: Unique ID for files on NTFS?


One thing you can use for file uids is the create timestamp, just ensure when you scan the files into your program you tweak any createtimes that are the same as one already encountered so they are minutely different, as obviously a few files even on NTFS can otherwise have the same TS if they were created at the same time by a mass file copy, but in the normal course of things you won't get duplicates and tweaking will be few if any. d


The user also mentions unique directory identification. That process is a bit more convoluted than retrieving unique information for a file; however, it is possible. It requires you to call the appropriate CREATE_FILE function which a particular flag. With that handle, you can call the GetFileInformationByHandle function in Ash's answer.

This also requires a kernel32.dll import:

        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern SafeFileHandle CreateFile(
            string lpFileName,
            [MarshalAs(UnmanagedType.U4)] FileAccess dwDesiredAccess,
            [MarshalAs(UnmanagedType.U4)] FileShare dwShareMode,
            IntPtr securityAttributes,
            [MarshalAs(UnmanagedType.U4)] FileMode dwCreationDisposition,
            uint dwFlagsAndAttributes,
            IntPtr hTemplateFile
        );

I'll flesh out this answer a bit more, later. But, with the above linked answer, this should begin to make sense. A new favorite resource of mine is pinvoke which has helped me with .Net C# signature possibilities.

0

精彩评论

暂无评论...
验证码 换一张
取 消