Tuesday, July 01, 2008

NPI Search Redundancy

Most healthcare professionals know at this point that all providers of health care, require NPI (National Provider Identifier) numbers. Without one, it will become increasingly difficult for claims to be paid by commercial payers, and impossible to collect medicare and medicaid payments.

Since we are a coding/billing/collection/management agency, we have frequented the NPPES (National Plan & Provider Enumeration System) to lookup NPI information.

Unfortunately, there have been brief periods of downtime of the site, causing us to implement our own solution: a backup of the registry.

I have a daily cron job that downloads the NPI database, push it into MySQL, giving us an albeit slow, but accurate access to an off-line version of this data.

The shell script below performs the retrieval:

#!/bin/sh
WORKDIR="/srv/htdocs/npi"
LOG="$WORKDIR/npi.log"
MONTH=`date +%b`
LASTMONTH=`date +%b --date='1 month ago'`
YEAR=`date +%Y`
URL="http://nppesdata.cms.hhs.gov/"
CFILE="NPPES_Data_Dissemination_${MONTH}_${YEAR}.zip"
LFILE="NPPES_Data_Dissemination_${LASTMONTH}_${YEAR}.zip"

cd $WORKDIR

# Attempt to download the file. --progress=dot:mega is used to prevent
# WGET_OUTPUT from getting too long.
FILE=$(2>&1 wget --timestamping --progress=dot:mega \
           "${URL}${CFILE}")


if [ "$?" -eq "0" ]; then
     echo "Retrieved zip file for ${MONTH} on `date`." > $LOG
     BOOL="1"
else
     FILE=$(2>&1 wget --timestamping --progress=dot:mega \
           "${URL}${LFILE}")
     if [ "$?" -eq "0" ]; then
             echo "Retrieved zip file for ${MONTH} on `date`." > $LOG
     else
             echo "There was a problem retreiving the file." > $LOG
     fi
fi

if [ "$BOOL" -eq "1" ]; then
    #Find npidata in zip file
    OFILE=`unzip -l ${CFILE} | grep csv | grep -vi 'fileheader' | awk '{ print $4 }'`

    #Now unzip it
    unzip ${CFILE} ${OFILE}
else
    #Find npidata in zip file
    OFILE=`unzip -l ${LFILE} | grep csv | grep -vi 'fileheader' | awk '{ print $4 }'`
   
    #Now unzip it
    unzip ${LFILE} ${OFILE}
fi

#Rename file
mv ${OFILE} source.csv

#Import data
echo "Importing ${OFILE} data..." >> ${LOG}
mysql -uroot -pmypass < /srv/htdocs/npi/npi.sql echo "Imported ${OFILE}." >> ${LOG}

#Remove source data
rm -f /srv/htdocs/npi/source.csv

#Remove the zip file
rm -f /srv/htdocs/npi/*.zip

Here is the npi.sql file used to import the data into the database. The data imported is trimmed down, since we didn't need all of it.

USE npi;
TRUNCATE TABLE npidata;
LOAD DATA INFILE '/srv/htdocs/npi/source.csv' INTO TABLE npidata FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES
(npi,entity_type_code,replacement_npi,ein,legal_business_name,last_name,first_name,
middle_name,prefix,suffix,credential,other_organization_name,other_organization_name_type_code,
other_last_name,other_first_name,other_middle_name,other_prefix,other_suffix,other_credential,
other_last_name_type_code,first_line_business_mailing_address,second_line_business_mailing_address,
business_mailing_address_city_name,business_mailing_address_state_name,business_mailing_address_postal_code,
business_mailing_address_country_code,business_mailing_address_telephone_number,
business_mailing_address_fax_number,first_line_business_practice_location_address,
second_line_business_practice_location_address,business_practice_location_address_city_name,
business_practice_location_address_state_name,business_practice_location_address_postal_code,
business_practice_location_address_country_code,business_practice_location_address_telephone_number,
business_practice_location_address_fax_number,enumeration_date,last_update_date,
npi_deactivation_reason_code,npi_deactivation_date,npi_reactivation_date,gender,authorized_official_last_name,
authorized_official_first_name,authorized_official_middle_name,authorized_official_title_or_position,
authorized_official_telephone_number);

And here is a look at the MySQL table used for this

mysql> describe npidata;
+-----------------------------------------------------+--------------+------+-----+---------+-------+
| Field                                               | Type         | Null | Key | Default | Extra |
+-----------------------------------------------------+--------------+------+-----+---------+-------+
| npi                                                 | varchar(20)  |      | PRI |         |       |
| entity_type_code                                    | char(2)      | YES  |     | NULL    |       |
| replacement_npi                                     | varchar(20)  | YES  |     | NULL    |       |
| ein                                                 | varchar(20)  | YES  |     | NULL    |       |
| legal_business_name                                 | varchar(255) | YES  |     | NULL    |       |
| last_name                                           | varchar(255) | YES  |     | NULL    |       |
| first_name                                          | varchar(255) | YES  |     | NULL    |       |
| middle_name                                         | varchar(255) | YES  |     | NULL    |       |
| prefix                                              | varchar(255) | YES  |     | NULL    |       |
| suffix                                              | varchar(255) | YES  |     | NULL    |       |
| credential                                          | varchar(255) | YES  |     | NULL    |       |
| other_organization_name                             | varchar(255) | YES  |     | NULL    |       |
| other_organization_name_type_code                   | varchar(10)  | YES  |     | NULL    |       |
| other_last_name                                     | varchar(255) | YES  |     | NULL    |       |
| other_first_name                                    | varchar(255) | YES  |     | NULL    |       |
| other_middle_name                                   | varchar(255) | YES  |     | NULL    |       |
| other_prefix                                        | varchar(255) | YES  |     | NULL    |       |
| other_suffix                                        | varchar(255) | YES  |     | NULL    |       |
| other_credential                                    | varchar(255) | YES  |     | NULL    |       |
| other_last_name_type_code                           | varchar(10)  | YES  |     | NULL    |       |
| first_line_business_mailing_address                 | varchar(255) | YES  |     | NULL    |       |
| second_line_business_mailing_address                | varchar(255) | YES  |     | NULL    |       |
| business_mailing_address_city_name                  | varchar(255) | YES  |     | NULL    |       |
| business_mailing_address_state_name                 | varchar(5)   |      | PRI |         |       |
| business_mailing_address_postal_code                | varchar(20)  | YES  |     | NULL    |       |
| business_mailing_address_country_code               | varchar(5)   | YES  |     | NULL    |       |
| business_mailing_address_telephone_number           | varchar(20)  | YES  |     | NULL    |       |
| business_mailing_address_fax_number                 | varchar(20)  | YES  |     | NULL    |       |
| first_line_business_practice_location_address       | varchar(255) | YES  |     | NULL    |       |
| second_line_business_practice_location_address      | varchar(255) | YES  |     | NULL    |       |
| business_practice_location_address_city_name        | varchar(255) | YES  |     | NULL    |       |
| business_practice_location_address_state_name       | varchar(5)   | YES  |     | NULL    |       |
| business_practice_location_address_postal_code      | varchar(20)  | YES  |     | NULL    |       |
| business_practice_location_address_country_code     | varchar(5)   | YES  |     | NULL    |       |
| business_practice_location_address_telephone_number | varchar(20)  | YES  |     | NULL    |       |
| business_practice_location_address_fax_number       | varchar(20)  | YES  |     | NULL    |       |
| enumeration_date                                    | varchar(15)  | YES  |     | NULL    |       |
| last_update_date                                    | varchar(15)  | YES  |     | NULL    |       |
| npi_deactivation_reason_code                        | varchar(255) | YES  |     | NULL    |       |
| npi_deactivation_date                               | varchar(15)  | YES  |     | NULL    |       |
| npi_reactivation_date                               | varchar(15)  | YES  |     | NULL    |       |
| gender                                              | char(2)      | YES  |     | NULL    |       |
| authorized_official_last_name                       | varchar(255) | YES  |     | NULL    |       |
| authorized_official_first_name                      | varchar(255) | YES  |     | NULL    |       |
| authorized_official_middle_name                     | varchar(255) | YES  |     | NULL    |       |
| authorized_official_title_or_position               | varchar(255) | YES  |     | NULL    |       |
| authorized_official_telephone_number                | varchar(20)  | YES  |     | 0       |       |
+-----------------------------------------------------+--------------+------+-----+---------+-------+

I then use a very simple php page to serve query's of the database:

<?php
function formatPhoneNumber($strPhone) {

        if (strlen($strPhone) != 10) {
                $strPrefix = substr($strPhone, 0, 3);
                $strNumber = substr($strPhone, 6, 4);
                $strPhone = $strPrefix."-".$strNumber;
                return $strPhone;
        }

      $strArea = substr($strPhone, 0, 3);
      $strPrefix = substr($strPhone, 3, 3);
      $strNumber = substr($strPhone, 6, 4);

       $strPhone = "(".$strArea.") ".$strPrefix."-".$strNumber;

       return ($strPhone);
}


function formatZipCode($strZipCode) {

        if (strlen($strZipCode) !=9) {
                return $strZipCode;
        }

        $strFirst = substr($strZipCode, 0, 5);
        $strSecond = substr($strZipCode, 5, 4);
        $strZipCode = $strFirst."-".$strSecond;
        return $strZipCode;

}

$db = "npi";
$user = "npi";
$pass = "npi";
$host = "localhost";

$con = mysql_connect($host, $user, $pass) or die("Could not connect; " . mysql_error());
$db_selected = mysql_select_db($db);

$mode=$_GET["mode"];

if ($mode > '1') {
?>
      <table class="report"><tbody>
<?php
         $sql = "SELECT * FROM npidata WHERE npi = \"$mode\"";
        $result = mysql_query($sql,$con);

        //echo $sql . "\n"; //debug

        $row = mysql_fetch_row($result);

        //echo "<pre>";
        //print_r($row); //debug
        //echo "</pre>";

        echo "<tr><td><span class=label>NPI</span></td><td>".$row[0]."</td></tr>";
        echo "<tr><td><span class=label>Full Name</span></td><td>".$row[8]." ".$row[6]." ".$row[7]." ".$row[5]."</td></tr>";
        echo "<tr><td><span class=label>Mailing Address</span></td><td>".$row[19]."<br>".$row[20]."<br>".$row[21]."<br>".$row[22].", ".$row[23]." ".formatZipCode($row[24])."</td></tr>";
        echo "<tr><td><span class=label>Phone</span></td><td>".formatPhoneNumber($row[26])."</td></tr>";
        echo "<tr><td><span class=label>Fax</span></td><td>".formatPhoneNumber($row[27])."</td></tr>";
        echo "<tr><td><span class=label>Practice Address</span></td><td>".$row[28]."<br>".$row[29]."<br>".$row[30].", ",$row[31]." ".formatZipCode($row[32])."</td></tr>";
        echo "<tr><td><span class=label>Phone</span></td><td>".formatPhoneNumber($row[34])."</td></tr>";
        echo "<tr><td><span class=label>Fax</span></td><td>".formatPhoneNumber($row[35])."</td></tr>";
        echo "</tbody></table>";

        mysql_free_result($result);
        mysql_close($con);

} else if ($_SERVER['REQUEST_METHOD'] != 'POST') {
?>

<form name="form1" action="<?php echo $_SERVER['PHP_SELF'];?>" method="post" onsubmit="return validate_form(this)">
<table border="0" cellspacing="5">
<tr><td>Last Name</td><td><input type="text" size="30" id="lname" name="lname"></td></tr>
<tr><td>First Name</td><td><input type="text" size="30" id="fname" name="fname"></td></tr>
<tr><td>State</td><td>
<select name="state" size="1" id="state">
<option value=""><select state here></option>
<option value="AL">Alabama</option>
<option value="AK">Alaska</option>
<option value="AZ">Arizona</option>
<option value="AR">Arkansas</option>
<option value="CA">California</option>
<option value="CO">Colorado</option>
<option value="CT">Connecticut</option>
<option value="DE">Delaware</option>
<option value="DC">Dist of Columbia</option>
<option value="FL">Florida</option>
<option value="GA">Georgia</option>
<option value="HI">Hawaii</option>
<option value="ID">Idaho</option>
<option value="IL">Illinois</option>
<option value="IN">Indiana</option>
<option value="IA">Iowa</option>
<option value="KS">Kansas</option>
<option value="KY">Kentucky</option>
<option value="LA">Louisiana</option>
<option value="ME">Maine</option>
<option value="MD">Maryland</option>
<option value="MA">Massachusetts</option>
<option value="MI">Michigan</option>
<option value="MN">Minnesota</option>
<option value="MS">Mississippi</option>
<option value="MO">Missouri</option>
<option value="MT">Montana</option>
<option value="NE">Nebraska</option>
<option value="NV">Nevada</option>
<option value="NH">New Hampshire</option>
<option value="NJ">New Jersey</option>
<option value="NM">New Mexico</option>
<option value="NY">New York</option>
<option value="NC">North Carolina</option>
<option value="ND">North Dakota</option>
<option value="OH">Ohio</option>
<option value="OK">Oklahoma</option>
<option value="OR">Oregon</option>
<option value="PA">Pennsylvania</option>
<option value="RI">Rhode Island</option>
<option value="SC">South Carolina</option>
<option value="SD">South Dakota</option>
<option value="TN">Tennessee</option>
<option value="TX">Texas</option>
<option value="UT">Utah</option>
<option value="VT">Vermont</option>
<option value="VA">Virginia</option>
<option value="WA">Washington</option>
<option value="WV">West Virginia</option>
<option value="WI">Wisconsin</option>
<option value="WY">Wyoming</option>
</select>

</td></tr>
<tr><td> </td><td><input value="Search" type="submit"></td></tr>
</tbody></table>


       <a href="javascript:history.go(-1)">Cancel</a>
<?php
} else if ($_SERVER['REQUEST_METHOD'] = 'POST') {

$fname = $_POST["fname"];
$fname = strtoupper($fname);

$lname = $_POST["lname"];
$lname = strtoupper($lname);

$state = $_POST["state"];
$state = strtoupper($state);

if ($fname == "") {
#RUN QUERY WITHOUT FNAME
        $sql = "SELECT * FROM npidata WHERE UPPER(last_name) = \"$lname\" AND UPPER(business_mailing_address_state_name) = \"$state\"";
} else {
        $sql = "SELECT * FROM npidata WHERE UPPER(last_name) = \"$lname\" AND UPPER(first_name) = \"$fname\" AND UPPER(business_mailing_address_state_name) = \"$state\"";
}

$result = mysql_query($sql,$con);

//echo $sql . "\n"; //debug
?>
<table class="report"><tbody>
<tr>
       <th>NPI</th>
       <th>First Name</th>
       <th>Last Name</th>
       <th>Address</th>
       <th>City</th>
       <th>State</th>
<form name="form1" method="post">
<?php

        while ($row = mysql_fetch_array($result)) {

                //echo "<pre>";
                //print_r(mysql_fetch_array($result)); //debug
                //echo "</pre>";

                echo "<tr><td><a href=\"" . $_SERVER['PHP_SELF'] . "?mode="  . $row[0] . "\">" . $row[0] . "</a></td>";
                echo "<td>" . $row[6] . "</td>";
                echo "<td>" . $row[5] . "</td>";
                echo "<td>" . $row[20] . "<br>" . $row[21] . "</td>";
                echo "<td>" . $row[22] . "</td>";
                echo "<td>" . $row[23] . "</td></tr>";

        } //end while

?>

       </form>
       </tbody>
       </table>
<?php
mysql_free_result($result);
mysql_close($con);

} // end if

?>

10 comments:

Sentinel said...

Great script John. I am able to download the data and unzip it. But how do you import that into the db? Can you please explain where you get the /srv/htdocs/npi/npi.sql file.

Thanks.

John Croson said...

Thanks, Sentinel. Seems I left out some important bits.... I've included the npi.sql script, and a snapshot of the MySQL table.

Hope that helps you!

Mike said...

Hi John

This is an interesting project. We have put the npi database online with a pretty quick search routine at www.e-physician.info. It is updated regularly, and uses a backend of oracle xe with php scripting.

Regards
Mike Clark
Coconut Island Software, Inc.

Aruna Jx said...

Hi John,
Great Code..
Thank you very much for sharing.
I used your code and was a success, except the source.csv file unzipped to only ~477 MB. But manual archiving gives about 3GB [file].csv file.

Can you please tell me if there is a known reason..?

Advanced Thanks.

John Croson said...

Thanks Aruna.

What version of unzip are you using? I could only surmise that you have an old version that has file size limits, or perhaps you have quotas enabled?

Hope that helps. Good luck.

bbrooke said...

Thanks so much for posting your script for downloading the NPI file via wget. When I clicked the link for the monthly ZIP file on the NPI web site, my downloaded file was always truncated... So, I learned something new (wget) and actually got a complete NPI file by using your script.

BTW, I think the new root URL is http://nppes.viva-it.com/ (replaces http://nppesdata.cms.hhs.gov/ )

William Entriken said...

Thanks John. Can I be a huge pain and ask for `SHOW CREATE TABLE npidata` please.

John Croson said...

William, you'll want to refer to my "describe npidata" information above. I've long since abandoned this project.

NpiDataSource said...

We are in the process of making the NPI database available for sale in MySql and Sql-Server. The sample database is located here if you'd like to check out the schema and DDL. What we do is take the raw .csv process to Sql-Server and then "Migrate" it to MySql. The only transformations we do is properly case the Address fields and de-duplicate the Licences, Identifiers, and Taxonomies.

http://npidatasource.com/samples/MySql_Sample_Output.zip

Let me know what you think... support@npidatasource.com.

NpiDataSource said...

I forgot a pertinent point. Both Sql-Server and MySql editions cost $50.00 per instance.

Thanks,

NpiDataSource.com