Pan Fashion Home > Who Knows > Download Gzipped CSV File and insert contents into MYSql table

Download Gzipped CSV File and insert contents into MYSql table


And finally, here it is all together, download the gzipped csv to your server, unpack it on the server, read the first two lines of the csv to create the table and import the data.

You don’t need to know the structure of the csv, this will automatically construct the mysql table for you.

<?php
//Configuration

$dbhost = 'localhost'; //normally localhost

$dbuser = 'your_dbase_username'; //your dbase username

$dbpass = 'your_dbase_password'; //your dbase password

$conn = mysql_connect($dbhost, $dbuser, $dbpass) or die('Error connecting to mysql');

$dbname = 'your_dbase_name'; //your dbase name

$tablename = 'your_table_name'; //which doesn't have to exist

mysql_select_db($dbname);

$url = "your_download_url";   // insert your download url for a gzipped, csv file

$filename = "temp.csv.gz"; // the name you want for the saved gzipped file

$newfilename='temp.csv'; //change this to whatever filename you want
?>
<?php

// maximum execution time in seconds
set_time_limit(120);
//set memory limit
ini_set("memory_limit","3072M");

$newfname = $filename;

$file = fopen ($url, "rb");
	if ($file) {
  		$newf = fopen ($newfname, "wb");

  	if ($newf)
  		while(!feof($file)) {
    		fwrite($newf, fread($file, 1024 ), 1024 );
  		}
	}

	if ($file) {
  		fclose($file);
	}

	if ($newf) {
  		fclose($newf);
	}

$zd = gzopen($filename, "r");
$contents = gzread($zd, 10000000); //because I have no idea how big your file will be
gzclose($zd);

$tempvar = fopen($newfilename,"w");
fwrite($tempvar, $contents);
fclose($tempvar);

?>
<?php
// GENERATE TABLE WITH FIELD_NAMES FROM FIRST LINE OF CSV FILE AND FIELD TYPES FROM SECOND LINE

$inputFile = $newfilename;

//OPEN CSV FILE FOR READING
$infile = fopen($inputFile, 'r');
//READ 5K, WHICH SHOULD BE ENOUGH TO READ THE FIRST TWO LINES FROM ANY CSV FILE
$contents = fread($infile, 5120);
// CLOSE CSV FILE
fclose($infile);
//CREATE AN ARRAY OF THE LINES FROM THE CSV FILE
$infilelines = explode("\n", $contents);
// CREATE AN ARRAY OF FIELD NAMES FROM THE FIRST LINE OF THE CSV
$fieldnames = explode(',', $infilelines[0]); // CHANGE THIS TO '\t' IF ITS TAB DELIMITED OR '|' IF ITS PIPE DELIMITED
// CREATE AN ARRAY OF FIELD CONTENT FROM THE SECOND LINE OF THE CSV
$fieldtype = explode(',', $infilelines[1]);
// BEGIN WRITING THE SQL QUERY
$thequery = 'CREATE TABLE IF NOT EXISTS  '.$tablename."\n\t".' ('."\n\n"; 

// THE LOOP
for($i = 0; $i <= count($fieldtype)-1; $i++) {  if(strlen($fieldtype[$i]) == 0) $type = 'char(80)';
// IF THERE IS A NUMBER DEFINE IT AS INT
if(preg_match('/[0-9]/', $fieldtype[$i])) $type = 'int(11)';
// IF ITS IN THIS FOMRAT 0000-00-00 DEFINE IT AS A DATE (thanks James)
if(preg_match('/\d\d\d\d-\d\d-\d\d/', $fieldtype[$i])) $type = 'datetime';
// IF THERE IS A FORWARD SLASH DEFINE IT AS A DATE
if(preg_match('/[\/]/', $fieldtype[$i])) $type = 'datetime';
// IF THERE IS A DECIMAL POINT DEFINE IT AS A DOUBLE (PROBABLY A CURRENCY)
if(preg_match('/[\.]/', $fieldtype[$i])) $type = 'double';
// THIS WILL BE THE DEFAULT VARCHAR(255) AFTER ALL THE OTHER OPTIONS HAVE BEEN CHECKED
if(preg_match('/[a-z\\\']/i', $fieldtype[$i])) $type = 'char(255)';
// IF ITS LONGER THAN 255 CHARACTERS DEFINE IT AS LONGTEXT
if(strlen($fieldtype[$i]) >=255) $type = 'longtext';
 if ($i == count($fieldtype)-1) {
 $thequery .= "\t".trim($fieldnames[$i])."\t".$type."\n";
 }
 else
 {
 $thequery .= "\t".$fieldnames[$i]."\t".$type.', '."\n";
 }

} // END THE LOOP
//FINISH THE SQL QUERY
$thequery .= ')';
$thequery = str_replace('-','_',$thequery);
$thequery = str_replace('condition','condition1',$thequery);
echo $thequery;
$result = mysql_query($thequery);

?>

<?php
//load the data from the csv file into the database
$sql = 'LOAD DATA LOCAL INFILE \''.$newfilename.'\' INTO TABLE '.$tablename.' FIELDS TERMINATED BY \',\' ENCLOSED BY \'"\' IGNORE 1 LINES';
$result = mysql_query($sql) or die (mysql_error()); 

?>

<?php
//close dbase connection
mysql_close($conn);
?>
  1. #1 by Quentin at August 25th, 2009

    Much the same as your Awin Create-a-feed products into your mysql database script, my hosting won’t allow remote fopen but will accept curl but I’ve no idea how to put this in the script. Any chance of a quick update?

  2. #2 by admin at August 25th, 2009

    Quentin, you will need to replace lines 31 to 57 with the following code: (and possibly increase the time limit)

    $ch = curl_init();
    	curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, 0);
       	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
       	$content = curl_exec($ch);
      	curl_close($ch);
    
    	$tempvar = fopen($filename,"w");
    	fwrite($tempvar, $content);
    	fclose($tempvar); 
    
     $zd = gzopen($filename, "r");
     $contents = gzread($zd, 1000000000); //because I have no idea how big your file will be
     gzclose($zd); 
    
     $tempvar = fopen($newfilename,"w");
     fwrite($tempvar, $contents);
     fclose($tempvar);
    
  3. #3 by Quentin at August 25th, 2009

    Thanks for the update, just gave it a go and got the following error:

    Warning: fopen(temp.csv.gz) [function.fopen]: failed to open stream: Permission denied in /home/wwwblah/public_html/test4.php on line 38

    Warning: fwrite(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 39

    Warning: fclose(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 40

    Warning: gzopen(temp.csv.gz) [function.gzopen]: failed to open stream: No such file or directory in /home/wwwblah/public_html/test4.php on line 42

    Warning: gzread(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 43

    Warning: gzclose(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 44

    Warning: fopen(temp.csv) [function.fopen]: failed to open stream: Permission denied in /home/wwwblah/public_html/test4.php on line 46

    Warning: fclose(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 48

    Warning: fopen(temp.csv) [function.fopen]: failed to open stream: No such file or directory in /home/wwwblah/public_html/test4.php on line 58

    Warning: fread(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 60

    Warning: fclose(): supplied argument is not a valid stream resource in /home/wwwblah/public_html/test4.php on line 62
    CREATE TABLE IF NOT EXISTS awin2 ( char(80) ) Table ‘wwwblah_awin.awin2′ doesn’t exist

  4. #4 by admin at August 25th, 2009

    Thats odd Quentin, as it works fine for me, the error is basically telling you that the file isn’t being downloaded, silly question I know, but, did you put a URL in the code?

    If you like you can email the whole page to me at admin@currentstyle.co.uk and I’ll see whats going wrong

  5. #5 by Quentin at August 26th, 2009

    Working with admin, he worked out that if you have the same problem you need to create the file ‘temp.csv.gz’ and ‘temp.csv’ on your server and give them 777 permissions. The script should work fine after this.

  6. #6 by Steve at September 27th, 2009

    This looks like exactly what I’m after but I think I’m making a schoolboy error as I can’t make it work at the moment.

    How exactly am I supposed to get the script to run? At present I’ve uploaded it and typed the url of the php file into my browser. This gets half the job done, the gzip and csv files then appear in the same directory but nothing happens to my database.

    Any ideas? Any help would be much appreciated.

    Steve.

  7. #7 by Steve at September 27th, 2009

    By the way the message I get is this:

    CREATE TABLE IF NOT EXISTStmobdeals ( product_code int(11), product_name char(255), brand_name char(255), level1 char(80), level2 char(255), level3 char(255), level4 char(255), level5 char(255), mapped_cat_level1 char(80), mapped_cat_level2 char(80), mapped_cat_id char(80), description char(80), buyat_short_deeplink_url char(255), old_style_deeplink_url char(255), image_url char(255), image_url_2 char(255), currency char(255), online_price char(255), offline_price char(255), recurring_charge double, old_price char(80), delivery_cost char(80), delivery_time char(80), availability double, promotion_text char(255), best_seller char(255), warranty char(80), condition1 char(80), offer_type char(80), manufacturer_code char(80), keywords char(80), location char(80), duration char(80), date_from char(80), date_to char(80), Make char(80), Name char(80), DeviceCostIncVAT char(255), SIMOnly char(255), Tariff double, Contract_term char(255), MonthlyLineRentalIncVAT char(255), AllowanceAmount int(11), AllowanceApplicability double, minutes char(80), texts char(80), data int(11), monetary_allowance int(11), device_description char(80), fieldO char(80), fieldP longtext, fieldQ char(80), fieldR char(80), fieldT char(80), fieldU char(80), fieldV char(80), fieldW char(80), fieldX char(80), fieldY char(80), fieldZ char(80), level6 char(80), level7 char(80), level8 char(80), char(80), char(80), char(80) ) Table ‘thedes8_tmobtest.tmobdeals’ doesn’t exist

  8. #8 by admin at September 28th, 2009

    Steve, how about, if you open up phpmyadmin and then paste this into the SQL box

    CREATE TABLE IF NOT EXISTS tmobdeals ( product_code int(11), product_name char(255), brand_name char(255), level1 char(80), level2 char(255), level3 char(255), level4 char(255), level5 char(255), mapped_cat_level1 char(80), mapped_cat_level2 char(80), mapped_cat_id char(80), description char(80), buyat_short_deeplink_url char(255), old_style_deeplink_url char(255), image_url char(255), image_url_2 char(255), currency char(255), online_price char(255), offline_price char(255), recurring_charge double, old_price char(80), delivery_cost char(80), delivery_time char(80), availability double, promotion_text char(255), best_seller char(255), warranty char(80), condition1 char(80), offer_type char(80), manufacturer_code char(80), keywords char(80), location char(80), duration char(80), date_from char(80), date_to char(80), Make char(80), Name char(80), DeviceCostIncVAT char(255), SIMOnly char(255), Tariff double, Contract_term char(255), MonthlyLineRentalIncVAT char(255), AllowanceAmount int(11), AllowanceApplicability double, minutes char(80), texts char(80), data int(11), monetary_allowance int(11), device_description char(80), fieldO char(80), fieldP longtext, fieldQ char(80), fieldR char(80), fieldT char(80), fieldU char(80), fieldV char(80), fieldW char(80), fieldX char(80), fieldY char(80), fieldZ char(80), level6 char(80), level7 char(80), level8 char(80), char(80), char(80), char(80) )

  9. #9 by Steve at September 29th, 2009

    I’ll give that a try, thanks.

  10. #10 by Andy Baxter at October 2nd, 2009

    This is a fantastic script and I have “nearly” got it working. I have downloaded CSV feed from Awin (Pipe delimited). I think I am correct in that I have to change lines 74, 76 and 114 (the end SQL statement).

    However, the table is being created just fine but the data isn’t being put in.

    I get the following error:-

    CREATE TABLE products1 ( merchant_id int(11), merchant_name varchar(255), aw_product_id int(11), merchant_product_id varchar(255), product_name varchar(255), description varchar(255), category_id int(11), category_name varchar(255), merchant_category varchar(255), aw_deep_link varchar(255), aw_image_url varchar(255), search_price double, delivery_cost double, merchant_deep_link varchar(255), merchant_image_url varchar(255) ) The used command is not allowed with this MySQL version

    So there’s obviously something in the SQL statement there that it doesn’t like.

    I think I am on mysql version 4.1.25.

    Any clues? I am desperate to get this working because I have been trying all day to write something of my own but eventually gave up!

  11. #11 by admin at October 2nd, 2009

    Hi Andy, the problem is that your database user doesn’t have the server priviliges to ‘LOAD DATA LOCAL INFILE’ via php.

    The only way I can see around it, requires that you have access to your mysql/data directory on the server, which I’d be surprised if you have.

    Is there any chance you could ask your hosting provider to give the necessary priviliges to the database user?

  12. #12 by Andy Baxter at October 2nd, 2009

    Don’t even ask how I worked this out but changing the db connection line to the following sorts it out…

    $client_flags = 128;

    $conn = mysql_connect($dbhost, $dbuser, $dbpass, FALSE, $client_flags);

    Now all the products are in, I am going to have a play with it for a while but a cursory look at the mysql data looks just fine and if that is the case then thanks a lot for your invaluable help!

  13. #13 by admin at October 2nd, 2009

    I’ve never even seen that 128 flag before, but on further investigation, it’s the flag for allowing the loading of local files (if it’s not explicitly allowed in mysql), good work Andy.

    You’ll find that the data has total integrity, and will exactly mirror what you downloaded.

    It’s my pleasure, any more help, you can email me admin@currentstyle.co.uk or just continue on here.

  14. #14 by Anna @ Pink Digital Camera at January 31st, 2010

    Thanks for this script – it works a treat :-)

    One question… What would be your recommendation for updating the data in the DB.

    Is it better to simply start by deleting all the data in the table (TRUNCATE TABLE…) and run the script again, or would it be better to update the details based on the AW Product ID ?

    Thanks for your advice.

  15. #15 by admin at January 31st, 2010

    I think it entirely depends on whether you mind having out of date products in your table, you could do an INSERT IGNORE INTO to only bring in the new products, my preference is to truncate, then update.

  16. #16 by Rod at March 13th, 2010

    Hello,

    Awesome script, but I have a few questions to get it to work for my purpose.
    I am trying to download a csv feed from Ticketmaster. I want it to display events that are going on sale for today, but also give them the option to see previous sales. Is there anyway to update the data daily, but still keep the old data? And the big question…can this be automated?

    Thanks for any help.

    Rod

  17. #17 by Michael at June 10th, 2010

    Hi,

    This script is fantastic, thanks a lot!

    I would however really appreciate it if you could possibly add the ability for the script to detect duplicate entries so those entries are not added to the database again?

    That would be perfect!

    Kind Regards,
    Michael

  18. #18 by admin at June 10th, 2010

    Hi Michael, there isn’t a facility to use the IGNORE command when using LOAD DATA (other than to specify the number of lines to ignore), so, the only way I can see around this would be to create a new table, then run a SQL query on the existing table, then use an ‘INSERT IGNORE INTO’ to shift the freshly downloaded data into the new table, which would then not overwrite any rows that already existed.

(will not be published)

Comment moderation is enabled. Your comment may take some time to appear.