Update complete

Posted on Dec 11, 2015 in generatedata.com, Open Source Projects | 0 comments

Alrighty! generatedata.com has been updated to 3.2.4. This was a pretty big update – the site hadn’t been updated in over a year (!). It includes lots of database changes, features and more. Please send me an email if you find any bugs.

Breaking Changes

This is the first time an upgrade has deliberately introduced a breaking change. I hated to do it, but it had to be done. The old Credit Card Data Type has been dropped. Please use the PAN Data Type in the “Credit Card Data” group.

New Features

Please note, the free downloadable version of generatedata.com also has these features and everything else discussed here. Not that I don’t love people donating (it sure helps pay the hosting bills!) but just so you know. :)

  • Data Set backups. For anyone with an account on the site, whenever you save a data set, it’ll automatically make a backup of it. It retains the last 200 edits you make to a data set. So, in case you accidentally mess something up, you can just browse the old saved versions to get it back.
  • Configurable Plugins. Over time, the plugin list has grown, and some plugins aren’t going to be as useful to some people as others. So now, when you log into the site you’ll see a new Settings tab. There, you can uncheck whatever Data Types, Export Types and Country-specific plugins you don’t want.
  • New Plugins & more country data. Chile and Sweden have been added to the country list, and there are a few more plugins, most of them country-specific.
  • Bug fixes. Lots!

Anyway, enjoy. As mentioned, send me an email if you spot anything wonky.

generatedata.com pending update

Posted on Dec 6, 2015 in generatedata.com, Open Source Projects | 0 comments

Just a heads up, I’ll be updating the public generatedata.com site soon. Hopefully tonight, but no promises: I don’t want to rush it. The live site has been running an old version of the script for some time now (3.1.4 from Sept of last year). There have been a lot of improvements since then, so this’ll mean a pretty big update.

[EDIT: Gah, abort! Bug found. I'm going to need to release a new generatedata build first before returning to the main website.]

generatedata 3.2.2

Posted on Nov 15, 2015 in generatedata.com, Open Source Projects | 0 comments

Yesterday I released a new version of generatedata. It’s been in development off and on for several months, so it’s nice to see it launched.

Today I’m going to wade through the list of issues that have piled up and fix the most significant ones, so expect a bug fix release to come out pretty fast.

Then… definitely time to update the website! That’s gotten pretty out of date.

generatedata.com site bug

Posted on May 26, 2015 in generatedata.com, Open Source Projects | 4 comments

So this is strange. Over the last 24 hours or so I’ve noticed that the generatedata.com website sometimes fails to load. The JS errors don’t make much sense: it’s as if the JS files aren’t fully loading in sequence. I haven’t changed the code in several months.

Seems like it’s local to Chrome only. Firefox seems fine. I’m going to monitor it – possibly it’s a Chrome issue that was introduced in 44.x.

generatedata 3.2.1

Posted on May 25, 2015 in generatedata.com, Open Source Projects | 2 comments

I just released a new version of generatedata. You can download the free standalone script from the github repo here. This new version fixes a few smaller issues that have been reported, but most significantly it includes a great new feature to back up your data set configurations. Now, any time you save a data set it automatically saves a new copy of the configuration. Loading a data set will always default to the latest copy, but in case you ever need an older one, you can click on the History link for that data set in the main dialog (found by clicking the Data Sets icon) and browse the history.

Every now and then I’ve heard of people running into problems when saving a data set. And if you’re dealing with really large data sets, clicking “save” and finding you’ve lost all your data is a pretty darn serious bug. Since I haven’t been able to reproduce it, I decided that this would be a nice interim fix – and it provides some good additional functionality as well.

I won’t be updating the public website for a little while longer yet. I’d like this feature to be out in the wild for a few months to confirm everything works as expected.

Enjoy! :D

generatedata 3.2.0

Posted on Jan 29, 2015 in generatedata.com, Open Source Projects | 6 comments

I just released generatedata.com 3.2.0, which includes a long-awaited feature: a REST API to allow programmatic generation of data sets. Yay! Great to see this sucker finally out the door.

Note: this new feature won’t be added to the public site – it’s intended for your own installations of the script only. The main generatedata.com offers a service where you can donate and get an account on the site. This is really just intended as a quick convenience and a way for people to contribute to the script. The downloadable version has all the functionality – and now more.

You can find the latest 3.2.0 tag on the github repo, and the new API functionality documented here. Enjoy! :)

generatedata API

Posted on Jan 18, 2015 in generatedata.com, Open Source Projects | 4 comments

Howdy all! A couple of weeks ago I decided to take on a feature for generatedata.com that I’ve been meaning to add for (literally) years: a REST API to allow programmatic generation of datasets, rather than forcing people to use the UI. This has always been something of a white whale for me. Initially I couldn’t see a way to solve it, then when I did, I always had a long list of work that needed to get done first.

But enough is enough! Today I got the core code working and a nice proof of concept in place: the Names Data Type is now working in conjunction with the JSON Export Type to generate random names in JSON via a REST endpoint. Pretty cool.

This should be a staggeringly useful feature and I think it’s conceptually pretty cool, so I’ll take the time to explain it here.

The problem

I designed the Data Generator to be modular so it could be used to generate any sort of data you want: text, numbers, strings, images, silly cat pictures – really anything. Check the developer section of the documentation for an explanation of all that. The problem with this design was that all the little pieces of the code were separate entities, had their own configuration settings and generated different things. Adding a REST API meant exposing all the functionality offered within the current UI to developers, and that meant making the options within the UI well-defined.

The solution

So here’s what I settled on. The two work-horse module types, Data Types (the type of data being generated) and Export Types (the format in which the data is generated), now define themselves via a JSON schema file (using the json-schema.org site and spec). That file lists the name and structure of all generation options for that module. It says which fields are optional and which are required, and what type (string, boolean, etc.) each setting is.

For example, let’s look at the Names Data Type. That module is used to generate human names: male and female first names, initials, surnames, etc. The schema file looks like this:

{
   "title": "Names",
   "$schema": "http://json-schema.org/draft-04/schema#",
   "type": "object",
   "properties": {
      "placeholder": {
         "type": "string"
      }
   },
   "required": ["placeholder"]
}

Kinda readable! Even coming in fresh. If you look at the generatedata UI and select “Names” for a row, this maps exactly to the options offered in the “Options” column. For this Data Type there’s only one option: a string that contains placeholders, which are switched out for names during data generation.
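
To make the idea concrete, here is a minimal Python sketch of how a settings object could be checked against a schema like the one above. It is an illustration only, not the code generatedata uses: it handles just the "required" list and the string and boolean types, and the names check_settings and JSON_TYPES are made up for this example.

```python
# Simplified check of a settings object against a Data Type schema.
# Assumption: only "required" and two primitive "type" values are handled;
# this is not a full JSON Schema validator.

NAMES_SCHEMA = {
    "title": "Names",
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {"placeholder": {"type": "string"}},
    "required": ["placeholder"],
}

# Map the JSON Schema type names used here to Python types.
JSON_TYPES = {"string": str, "boolean": bool}


def check_settings(schema, settings):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Every required setting must be present.
    for name in schema.get("required", []):
        if name not in settings:
            problems.append(f"missing required setting: {name}")
    # Every known setting must have the declared type.
    for name, value in settings.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # unknown settings are ignored in this sketch
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"setting {name} should be of type {spec['type']}")
    return problems


print(check_settings(NAMES_SCHEMA, {"placeholder": "Name Surname"}))
print(check_settings(NAMES_SCHEMA, {}))
```

The first call reports no problems; the second reports the missing placeholder setting.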

Now, to use the API, developers need to POST JSON content to a specific endpoint (http://yoursite/generatedata/api/v1/data) in a particular format. Here’s a simple example that generates 100 names of the format “Beth R. Mackenzie”:

{
  "numRows": 100,
  "rows": [
    {
      "type": "Names",
      "title": "A full name",
      "settings": {
        "placeholder": "Name Initial. Surname"
      }
    }
  ],
  "export": {
    "type": "JSON",
    "settings": {
      "stripWhitespace": false,
      "dataStructureFormat": "simple"
    }
  }
}
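
For anyone who wants to try this from a script, here is a rough Python sketch that POSTs the request above using only the standard library. Treat it as a starting point under a few assumptions: "yoursite" is a placeholder for your own installation, the Content-Type header is my guess at what a JSON endpoint would want, and build_request and generate are names invented for this example.

```python
import json
import urllib.request

# Placeholder host: swap in the address of your own installation.
API_URL = "http://yoursite/generatedata/api/v1/data"

# The same request body as the JSON example above.
PAYLOAD = {
    "numRows": 100,
    "rows": [
        {
            "type": "Names",
            "title": "A full name",
            "settings": {"placeholder": "Name Initial. Surname"},
        }
    ],
    "export": {
        "type": "JSON",
        "settings": {"stripWhitespace": False, "dataStructureFormat": "simple"},
    },
}


def build_request(url, payload):
    """Wrap the payload in a POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the endpoint accepts a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(url, payload):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(url, payload)) as response:
        return response.read().decode("utf-8")
```

Calling generate(API_URL, PAYLOAD) against a real installation returns whatever the server sends back; building the request separately makes it easy to inspect before anything goes over the wire.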

Conceptually it’s super simple. The settings object in each entry of the rows array always contains whatever settings are relevant for that Data Type. The same goes for the settings object in the export object below it. So for the rows property, you define an array, each element of which is a Data Type with whatever settings you want. Here’s an example that generates two rows of Names, in different formats:

{
  // ...
  "rows": [
    {
      "type": "Names",
      "title": "A full name",
      "settings": { "placeholder": "Name Initial. Surname" }
    },
    {
      "type": "Names",
      "title": "A female name only",
      "settings": { "placeholder": "FemaleName" }
    }
  ],
  "export": {
    // ...
  }
}
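
Since every row has the same shape (type, title, settings), a request with several rows is easy to assemble in code. The Python sketch below shows one way to do it. The helpers make_row and make_payload are hypothetical names for this example, and the numRows and export values are simply borrowed from the first request shown earlier.

```python
import json


def make_row(data_type, title, **settings):
    """Build one entry of the "rows" array for a given Data Type."""
    return {"type": data_type, "title": title, "settings": settings}


def make_payload(num_rows, rows, export_type, **export_settings):
    """Assemble the full request body from rows and export options."""
    return {
        "numRows": num_rows,
        "rows": list(rows),
        "export": {"type": export_type, "settings": export_settings},
    }


payload = make_payload(
    100,
    [
        make_row("Names", "A full name", placeholder="Name Initial. Surname"),
        make_row("Names", "A surname only", placeholder="Surname"),
    ],
    "JSON",
    stripWhitespace=False,
    dataStructureFormat="simple",
)

# Serialize to the JSON text that would be POSTed to the endpoint.
print(json.dumps(payload, indent=2))
```

Adding another column of data is then just one more make_row call in the list.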

I’ve also added some nice clear error handling that tells you exactly what’s wrong, which should prove invaluable for debugging.

In the coming days I’ll be finishing off the code and working on the documentation. As I said, now that I’ve gotten the proof of concept going it should be straightforward to complete from here on out. I’ll keep you posted. :)