
NSRL hash import


  • NSRL hash import

    My NSRL RDS modern minimal hash set install has been running for days and is now creeping along, with 8.6 million hashes read and the count incrementing by about 7-8 hashes per screen update. I don't want to cancel for fear of losing all this time. Is this normal performance? I have a fairly powerful HP Z820 with 64 GB of RAM and 10 cores. If this is abnormal, are there any tweaks I can make to boost throughput going forward?

  • #2
    Using a RAM drive (if you have memory to spare) or a fast SSD is recommended. Some additional information can be found in the "Import duration" section of the tutorial page for importing hash sets.


    • #3
      The CSV-to-database conversion process is mostly disk bound, with some dependency on single-threaded core performance. This is especially true near the end of the import, as the database grows. So lots of cores don't help, because the process is disk bound. That HP machine seems to use the E5-2660 v2, which is 4 years old. I am also guessing you are using old-style spinning disks for the source and destination files.

      We'll see if we can add a bit more detail about using RAM drives for the conversion, as it might be 10x faster than using an old-style HDD.


      • #4
        Thank you both for your replies. Will try out and post result. Best regards.


        • #5
          Here are some detailed steps on how to speed up an NSRL import using a temporary RAM drive:


          1. In OSForensics click on the "Mount Drive Image" button. The OSFMount window should open. Click "Mount new...".
          2. Under Source, select "Empty RAM drive".
          3. Under drive size, allocate enough memory for the input files, the output files, and the OSForensics USB installation (500MB). The input and output file sizes vary with the NSRL hash set version. For example, the Dec 2016 NSRL consisted of around 15GB of uncompressed CSV text as input, and the database produced by the OSF import was around 20GB, so the RAM drive for that set needs to be at least 36GB. The minimal NSRL set is smaller. Future sets might be larger, so a 40GB or 50GB drive would be safer. If RAM is limited, placing just the database on the RAM drive (and leaving the input files on an SSD) is also an option.
          4. Uncheck "Read-only drive".
          5. Check "Mount as removable media". This is important as it simulates a USB drive.
          6. Leave remaining default values and click "OK". The RAM drive should then be created. In older versions of OSFMount you need to manually format the drive from Windows.
          7. Copy the uncompressed NSRL data to the RAM drive you just created. If the input data is separated into disks, you can do this one disk at a time.
          8. In OSForensics click on the "Install to USB" menu option.
          9. Select the RAM drive you just created, enter your license key and click Install. Then close OSF.
          10. Launch OSForensics.exe from the OSForensics folder on the RAM drive.
          11. Open the Hash Sets module and click "New DB...".
          12. Enter a name for the database and click OK. Note that there is a bug with the current version where the import buttons don't enable until the software is reopened.
          13. Ensure the database you created is the active database using the right-click menu or the "Make Active" button. The icon will turn yellow.
          14. Click "NSRL Import...", browse to the NSRL data on the RAM drive, click "OK" and allow the import to complete. If importing one disk at a time, repeat the import process on the same database. It is also wise to back up the database between disks in case an error occurs.
          15. Once the database has been fully created, you can copy it over from the temporary RAM drive (e.g. "F:\OSForensics\AppData\hashSets\NSRLDiskA.OSF Hash Set") to the default location (e.g. "C:\ProgramData\PassMark\OSForensics\hashSets" ).

          IMPORTANT: The contents of the RAM drive will be lost if the machine is powered down or crashes. So you need a really stable machine to do this. At the end of the import process you MUST copy the database back to the hard drive. Even using a RAM drive the import process is slow due to the vast amount of data and can take many hours.
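
As a hedged illustration of the final copy-back step, here is a minimal Python sketch that copies a finished database file off the RAM drive and compares SHA-256 digests of the source and the copy before you unmount the drive. This is not part of OSForensics; the helper names `sha256_of` and `copy_and_verify` are hypothetical, and the paths in the commented usage are just the example paths from step 15.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst_dir):
    """Copy src into dst_dir and raise OSError if the copy differs."""
    src = Path(src)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    shutil.copy2(src, dst)  # copies file contents and timestamps
    if sha256_of(src) != sha256_of(dst):
        raise OSError(f"Copy of {src} to {dst} does not match the source")
    return dst


# Example usage (paths taken from step 15; adjust to your drive letter):
# copy_and_verify(
#     r"F:\OSForensics\AppData\hashSets\NSRLDiskA.OSF Hash Set",
#     r"C:\ProgramData\PassMark\OSForensics\hashSets",
# )
```

Only unmount or power down the RAM drive after the copy has been verified, and close OSForensics first so the database file is not in use during the copy.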

          In V6 we'll look at making the process of using a RAM drive easier.


          • #6
            We just imported the NSRL Dec 2017 Revision 2.59 (Full Modern RDS with ~121 Million records). It still took a few days, even with a large RAM drive.

            Size of the CSV input files was ~14.4GB.
            Size of the indexed database file was ~20.7GB.
            So the size of the RAM drive needs to be at least 36GB. We would suggest slightly higher, 40GB, just to be safe.
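
The sizing arithmetic above can be written out as a small Python sketch, assuming the RAM drive has to hold the input files, the output database, and the roughly 500MB OSForensics USB installation mentioned in the earlier steps. The function name `ram_drive_size_gb` and the `headroom_gb` parameter are hypothetical, added only for illustration.

```python
import math


def ram_drive_size_gb(input_gb, database_gb, install_gb=0.5, headroom_gb=0.0):
    """Return the RAM drive size in whole GB, rounded up.

    input_gb    -- size of the uncompressed NSRL CSV files
    database_gb -- size of the finished hash set database
    install_gb  -- OSForensics USB installation (about 500MB)
    headroom_gb -- optional safety margin (an assumption, not a fixed rule)
    """
    return math.ceil(input_gb + database_gb + install_gb + headroom_gb)


# Dec 2017 figures from this post: minimum size, then with 4GB of headroom.
minimum = ram_drive_size_gb(14.4, 20.7)
with_margin = ram_drive_size_gb(14.4, 20.7, headroom_gb=4)
print(minimum, with_margin)
```

With the Dec 2017 figures this gives the 36GB minimum and the suggested 40GB once some headroom is added.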

            The reduced sets (Minimal or Unique) should be significantly quicker to convert.
            Once imported, database lookups were really quick despite the large number of records.

            We are, however, looking at improving the import speed for V6. It would be better if the full set didn't take days.


            • #7
              Update: For OSForensics V6 we have made two significant changes to the NSRL import process.
              1) We have made the process of using a RAM drive much easier. You can now specify a temp folder (on a RAM drive) for the import process.
              2) We have made some assumptions about the alphabetic sorting of hashes in the NSRL source files. If we assume the data is pre-sorted (as it seems to be in the latest releases), we can save a bunch of processing during the import.

              With these changes we were able to speed up the import process 8 times. So it is now measured in hours instead of days.
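
To illustrate the pre-sorted assumption in point 2, here is a minimal Python sketch that checks, in a single pass, whether the hash column of some CSV rows is already in ascending order. It is not the OSForensics implementation. The function name `hashes_are_sorted`, the sample rows, and the assumption that the hash sits in the first column are all hypothetical.

```python
import csv
import io


def hashes_are_sorted(rows, column=0):
    """Return True if the hash column is in non-decreasing order.

    rows   -- an iterable of CSV rows (lists of strings), header excluded
    column -- index of the hash column (assumed to be the first column)
    """
    previous = None
    for row in rows:
        current = row[column].upper()  # compare case-insensitively
        if previous is not None and current < previous:
            return False  # out-of-order hash found; input is not pre-sorted
        previous = current
    return True


# Made-up sample data: a header line followed by three rows.
sample = io.StringIO(
    '"SHA-1","FileName"\n'
    '"000000206738748EDD92C4E3D2E823896700F849","a.txt"\n'
    '"0000004DA6391F7F5D2F7FCCF36CEBDA60C6EA02","b.txt"\n'
    '"000000A9E47BD385A0A3685AA12C2DB6FD727A20","c.txt"\n'
)
reader = csv.reader(sample)
next(reader)  # skip the header row
print(hashes_are_sorted(reader))
```

An importer that can rely on sorted input can append records in order rather than sorting them itself. A cheap check like this shows whether a given release still satisfies that assumption.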