ChatGPT: Convert JSON to CSV, and generate some code to automate.

The goal was to have a Go program that converts JSON data I have on a project to a CSV file. That turned out to be easy for ChatGPT, so I took a few extra steps and had it generate some Go code: first a naive version, then one that streams through the input JSON file. What follows is the ChatGPT conversation.

You: turn this json snippet into csv: [ { "Id": 0, "Date": "2022-11-01T09:00:00+09:00", "Project": null, "Cost": 5000, "Estimate": 6000, "Updated": "0001-01-01T00:00:00Z" }, { "Id": 0, "Date": "2022-10-01T09:00:00+09:00", "Project": null, "Cost": 4000, "Estimate": 5000, "Updated": "0001-01-01T00:00:00Z" } ]


ChatGPT:

Id,Date,Project,Cost,Estimate,Updated
0,2022-11-01T09:00:00+09:00,,5000,6000,0001-01-01T00:00:00Z
0,2022-10-01T09:00:00+09:00,,4000,5000,0001-01-01T00:00:00Z


You: generate a go script that can convert json files to csv files like the ones above


ChatGPT: Here is an example Go script that can convert JSON files to CSV files:

package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"os"
)

func main() {
	// Open the input JSON file
	inputFile, err := os.Open("input.json")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer inputFile.Close()

	// Decode the JSON data
	var data []map[string]interface{}
	err = json.NewDecoder(inputFile).Decode(&data)
	if err != nil {
		fmt.Println(err)
		return
	}

	// Open the output CSV file
	outputFile, err := os.Create("output.csv")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer outputFile.Close()

	// Create a new CSV writer
	writer := csv.NewWriter(outputFile)

	// Write the header row
	header := []string{"Id", "Date", "Project", "Cost", "Estimate", "Updated"}
	err = writer.Write(header)
	if err != nil {
		fmt.Println(err)
		return
	}

	// Write the data rows
	for _, item := range data {
		row := []string{
			formatValue(item["Id"]),
			formatValue(item["Date"]),
			formatValue(item["Project"]),
			formatValue(item["Cost"]),
			formatValue(item["Estimate"]),
			formatValue(item["Updated"]),
		}
		err = writer.Write(row)
		if err != nil {
			fmt.Println(err)
			return
		}
	}

	// Flush the writer and surface any buffered write error
	writer.Flush()
	if err := writer.Error(); err != nil {
		fmt.Println(err)
	}
}

// formatValue converts a decoded JSON value to its CSV string form,
// writing an empty field for JSON null (which decodes to nil).
func formatValue(v interface{}) string {
	if v == nil {
		return ""
	}
	return fmt.Sprintf("%v", v)
}
This script reads the input JSON file and decodes it into a slice of maps, where each map represents a row in the CSV file. It then writes the header row and the data rows to the output CSV file. Each value is converted to a string with `fmt.Sprintf`; the `formatValue` helper turns JSON `null` (which decodes to `nil`) into an empty CSV field rather than the literal `<nil>` that `%v` alone would produce.
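
As a quick aside (my addition, not part of the ChatGPT output): with `map[string]interface{}` as the target, `encoding/json` decodes JSON numbers into `float64` and JSON `null` into `nil`, which is exactly why the null handling above matters. A minimal sketch:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// A tiny array of objects, shaped like the data in the conversation
	data := []byte(`[{"Id": 0, "Cost": 5000, "Project": null}]`)
	var rows []map[string]interface{}
	if err := json.Unmarshal(data, &rows); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%T\n", rows[0]["Cost"])    // float64
	fmt.Println(rows[0]["Project"] == nil) // true
	fmt.Printf("%v\n", rows[0]["Project"]) // <nil>
}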

You: the script is great but it loads the whole file into memory .. rewrite the script so that it streams the input file


ChatGPT: Here’s an updated version of the Go script that streams the input file to minimize memory usage:

package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"os"
)

func main() {
	// Open the input JSON file
	inputFile, err := os.Open("input.json")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer inputFile.Close()

	// Open the output CSV file
	outputFile, err := os.Create("output.csv")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer outputFile.Close()

	// Create a new CSV writer
	writer := csv.NewWriter(outputFile)

	// Write the header row
	header := []string{"Id", "Date", "Project", "Cost", "Estimate", "Updated"}
	err = writer.Write(header)
	if err != nil {
		fmt.Println(err)
		return
	}

	// Stream the JSON data and write the data rows. The input is a JSON
	// array, so consume the opening '[' token first, then decode one
	// object at a time while elements remain.
	decoder := json.NewDecoder(inputFile)
	if _, err := decoder.Token(); err != nil {
		fmt.Println(err)
		return
	}
	for decoder.More() {
		var item map[string]interface{}
		if err := decoder.Decode(&item); err != nil {
			fmt.Println(err)
			return
		}

		row := []string{
			formatValue(item["Id"]),
			formatValue(item["Date"]),
			formatValue(item["Project"]),
			formatValue(item["Cost"]),
			formatValue(item["Estimate"]),
			formatValue(item["Updated"]),
		}
		err = writer.Write(row)
		if err != nil {
			fmt.Println(err)
			return
		}
	}

	// Consume the closing ']' token of the JSON array
	if _, err := decoder.Token(); err != nil {
		fmt.Println(err)
		return
	}

	// Flush the writer and surface any buffered write error
	writer.Flush()
	if err := writer.Error(); err != nil {
		fmt.Println(err)
	}
}

// formatValue converts a decoded JSON value to its CSV string form,
// writing an empty field for JSON null (which decodes to nil).
func formatValue(v interface{}) string {
	if v == nil {
		return ""
	}
	return fmt.Sprintf("%v", v)
}
This updated script uses `json.Decoder` to stream the input file: it consumes the array's opening `[` token, then decodes one JSON object at a time with `decoder.More()` and immediately writes the corresponding row to the output CSV file. This way, only a single object is held in memory at any given time, minimizing the risk of running out of memory on large input files.
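
To try the streaming pattern in isolation, here is a small self-contained sketch (my addition, not part of the ChatGPT output) that runs the same `Token()`/`More()` loop over an in-memory reader instead of a file:

package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

func main() {
	// The same array-of-objects shape as the input file, held in memory
	r := strings.NewReader(`[{"Id": 1, "Cost": 4000}, {"Id": 2, "Cost": 5000}]`)
	dec := json.NewDecoder(r)

	// Consume the opening '[' token
	if _, err := dec.Token(); err != nil {
		fmt.Println(err)
		return
	}

	// Decode and print one object at a time while elements remain
	for dec.More() {
		var obj map[string]interface{}
		if err := dec.Decode(&obj); err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println(obj["Id"], obj["Cost"])
	}

	// Consume the closing ']' token
	if _, err := dec.Token(); err != nil {
		fmt.Println(err)
		return
	}
}

Either script can be run with `go run` against an `input.json` in the current directory and will write its output to `output.csv`.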