Extraction of "User-Generated" Files from Forensic Image

  • Extraction of "User-Generated" Files from Forensic Image

    Is it possible to use OSForensics to identify and extract "User-Generated" files from a forensic image file?

    I have imaged seven workstations to E01 format.

    1. My next step is to extract only the "User-Generated" files from each of the seven forensic images. I have a defined list of "User-Generated" file extensions, and I need to export the matching files while maintaining the original file/folder paths, including:


    2. Once I have a discrete folder for each of the seven forensic image files, our practice will then index all of the "User-Generated" files for attorney review in Relativity.

    So, the resulting folder of data to be indexed for Relativity review, using the example of John Smith's forensic image, would look like:


    If this is not currently possible, it would be very helpful if a future OSF release had a view that automatically organized files by file type into discrete folders: all Word files in a Word folder, all email files in an Email folder, all SQLite database files in a SQLite database folder, and so on. A user could then select the folders of data wanted and, with another mouse click, export those "User-Generated" files either with the original folder paths maintained or loose in one top-level folder. This would make OSForensics a fantastic staging tool for electronic discovery cases.

    It is best practice to segregate user-generated files from system files before ingesting data into electronic discovery processing tools such as LAW or Nuix, because the customer is traditionally charged by the GB count of files ingested. Billing a customer for some forensic hours to segregate out user-generated files typically saves the customer thousands of dollars and avoids the needless processing of system files.
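    To make the requested "organize by file type" view concrete, here is a minimal Python sketch of the grouping logic. It is not an OSForensics feature or API. The function name `group_by_category` and the category map are hypothetical, and the extensions shown are illustrative only, not the defined list mentioned above.

```python
from collections import defaultdict
from pathlib import PureWindowsPath

# Hypothetical category map for illustration; replace with the real extension list.
CATEGORIES = {
    "Word": {".doc", ".docx"},
    "Email": {".pst", ".ost", ".msg", ".eml"},
    "SQLite": {".sqlite", ".db"},
}


def group_by_category(paths, categories=CATEGORIES):
    """Return {category: [paths]} for paths whose extension belongs to a category.

    Paths with an extension that is in no category (e.g. system files) are skipped.
    """
    # Invert the map so each extension points at its category.
    ext_to_cat = {ext: cat for cat, exts in categories.items() for ext in exts}
    groups = defaultdict(list)
    for p in paths:
        # PureWindowsPath parses Windows-style paths on any host OS.
        ext = PureWindowsPath(p).suffix.lower()
        cat = ext_to_cat.get(ext)
        if cat is not None:
            groups[cat].append(p)
    return dict(groups)
```

    Matching is done on the lower-cased extension only, so a renamed file (for example, a Word document saved as .dat) would be missed. Signature-based identification would be needed to catch those.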

  • #2
    Does this approach make sense?

    Some of the potential problems are:
    • Exporting files to a different file system means you can lose metadata (like who owns the file) and lose file time accuracy. Then down the track you are also likely to lose information like the file access time.
    • Saving files with their original path runs the risk of exceeding the file path length limits.
    • Some paths cannot be saved in some file systems (e.g. FAT32 and NTFS have different allowed characters).
    • You can lose NTFS streams. For example, if a file has been downloaded from the Internet, that fact is stored in an alternate NTFS data stream (Zone.Identifier).
    • Any information that might have been in the file slack space is lost.
    • Any deleted records in the MFT are lost ($I30 records, etc.).
    • Loss of shadow copies of files.
    • There is a real risk of a file name collision and data loss unless you are really careful about using different folders for each volume. So for example, D:\Document.doc and E:\Document.doc overwrite each other once you export them to a new folder.
    • Loss of image thumbnails.
    Also, there would seem to be many document types that a user might create that aren't in your list of file types (e.g. JPG, GIF, ODS, MP4, DBX, RAR, TXT, etc.).
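    Two of the hazards above, path length and name collisions, can be checked before anything is copied. The following is a minimal Python sketch of such a pre-flight check, not something OSForensics provides. The function `plan_export`, its parameters, and the per-volume subfolder layout are assumptions for illustration. The 260-character figure is the classic Windows MAX_PATH limit, and the destination may allow more or less.

```python
from pathlib import PureWindowsPath

MAX_PATH = 260  # classic Windows limit, including the terminating null


def plan_export(sources, dest_root, per_volume=True, max_len=MAX_PATH):
    """Map each source path to a destination and report problems up front.

    Returns (plan, problems): plan is {source: destination}; problems is a list
    of (kind, source, detail) tuples for collisions and over-long paths.
    """
    dest_root = PureWindowsPath(dest_root)
    plan, problems, seen = {}, [], {}
    for src in sources:
        p = PureWindowsPath(src)
        # Drop the anchor (e.g. "D:\\") so the rest can be re-rooted.
        rel_parts = p.parts[1:] if p.anchor else p.parts
        if per_volume:
            # One subfolder per source volume keeps D:\x and E:\x apart.
            volume = p.drive.rstrip(":") or "_novolume"
            dest = dest_root.joinpath(volume, *rel_parts)
        else:
            dest = dest_root.joinpath(*rel_parts)
        key = str(dest).lower()  # compare case-insensitively, as NTFS normally does
        if key in seen:
            problems.append(("collision", src, seen[key]))
            continue
        seen[key] = src
        if len(str(dest)) >= max_len:
            problems.append(("path too long", src, str(dest)))
            continue
        plan[src] = str(dest)
    return plan, problems
```

    With per_volume=False, D:\Document.doc and E:\Document.doc are reported as a collision. With the default per-volume layout they land in separate D and E subfolders. The sketch does nothing about the other items in the list (streams, slack space, timestamps), which a plain file copy cannot preserve.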

    If you just want all files of certain types dumped into a single folder, it is fairly easy. As an example, the steps are:
    1. Go to the File Name search window in OSF.
    2. Select Office documents from the drop-down list (you can customize this drop-down list to hold your own list of file extensions).
    3. Click on Search. Then CTRL-A to select all the documents found.
    4. Right-click, then from the menu select Saved checked items to disk.
    There is another option as well, which is maybe even better. Instead of saving the files to disk, add them to the open case. Once in the case, they are stored in this folder:
    C:\Users\<UserName>\Documents\PassMark\OSForensics\Cases\<Casename>\Files\<HashedPath>\<FileName>

    This has the advantages that:
    a) you can't get a file name collision
    b) you can't overflow the max path length
    c) a lot of metadata is stored with the file, like SHA1 hash values, accurate NTFS times and the original volume name.

    You can then supply this \<Casename>\Files\ folder to the client.
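    To illustrate why a hashed-path layout avoids both collisions and long paths, here is a minimal Python sketch of the general idea. It is an assumption-laden illustration, not OSForensics' actual storage scheme or metadata format. The function `add_to_case`, the choice of SHA1 over the lower-cased original path, and the `.meta.json` sidecar are all hypothetical.

```python
import hashlib
import json
import shutil
from pathlib import Path, PureWindowsPath


def add_to_case(local_copy, original_path, files_dir):
    """Copy local_copy to files_dir/<hash of original path>/<file name>.

    A JSON sidecar records the original path plus the content SHA1 and size,
    so the short folder name loses no information about where the file came from.
    """
    local_copy = Path(local_copy)
    name = PureWindowsPath(original_path).name
    # Hash the full original path: same name on different volumes -> different folders,
    # and the folder name has a fixed length however deep the original path was.
    path_hash = hashlib.sha1(original_path.lower().encode("utf-8")).hexdigest()
    dest_dir = Path(files_dir) / path_hash
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / name
    shutil.copy2(local_copy, dest)  # copy2 also tries to preserve timestamps
    meta = {
        "original_path": original_path,
        # Reads the whole file into memory; fine for a sketch, chunk it for large files.
        "sha1": hashlib.sha1(dest.read_bytes()).hexdigest(),
        "size": dest.stat().st_size,
    }
    sidecar = dest_dir / (name + ".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return dest
```

    Two files that were both named Document.doc on different volumes end up in different hash-named folders, each with its own sidecar, so neither overwrites the other.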

    If the end goal is to get just these files indexed and searchable, then you can do this all within OSF. You can select just these file types in the create index process and index them, then provide the searchable index to the client. Maybe you don't need an expensive e-discovery tool.