Computational & Data Journalism @ Cardiff

Reporting. Building. Designing. Informing

Capturing OSINT flags with Cardiff’s Cybersoc

3rd May 2020 by Aidan O'Donnell

Cardiff University’s Cyber Society gave us all its Capture the Flag challenge earlier this year and now has over a thousand players on its leaderboard, many of them sitting on the maximum score of 15,000.

The challenges are organised into three streams: ten introductory questions to get you warmed up, 18 tasks for online intelligence gathering and finally a dozen challenges centred around some fictional characters and their online life.

There are no pre-requisites for attempting it — it starts with a “What is OSINT?” question, so beginners are welcome — but it should test most players’ “resilience” (i.e. can you keep playing even though you’ve run out of ideas, patience and any sense that you once knew anything about online intelligence gathering?). At least one of our Computational Journalism students has made it successfully through all the challenges.

The challenges were featured by We Are OSINTCurious on its webcast in March.

Filed Under: Blog Tagged With: education, investigation, OSINT, students

SELECT * FROM a day of SQL…

6th March 2020 by Aidan O'Donnell

This month our students survived a full-day workshop on SQL, moving from the very basics of the syntax to querying datasets or working through some of the better tutorials.

First up was the excellent Select Star tutorial by Zi Chong Kao, which is based on a dataset of US prisoners executed since 1976.

We then looked for news lines in a SQLite database of US baby names (via the command line) and wrote queries in Carto to map a dataset of protected Welsh monuments.
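For anyone who wants to try the same kind of query from Python rather than the command line, here is a minimal sketch using the standard library's `sqlite3` module. The `names` table, its columns (`year`, `name`, `sex`, `births`) and the handful of toy rows are assumptions made up for illustration; they are not the schema or contents of the database we used in the workshop.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory database with a hypothetical baby-names schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE names (year INTEGER, name TEXT, sex TEXT, births INTEGER)"
    )
    # Toy rows for illustration only, not real figures.
    conn.executemany(
        "INSERT INTO names VALUES (?, ?, ?, ?)",
        [
            (2000, "Alex", "M", 30),
            (2000, "Alex", "F", 10),
            (2000, "Sam", "F", 25),
            (2001, "Sam", "M", 50),
        ],
    )
    return conn


def top_names(conn, year, limit=3):
    """Return the most common names for one year as (name, total) pairs."""
    query = """
        SELECT name, SUM(births) AS total
        FROM names
        WHERE year = ?
        GROUP BY name
        ORDER BY total DESC, name ASC
        LIMIT ?
    """
    # Parameters are passed separately so SQLite handles the quoting.
    return conn.execute(query, (year, limit)).fetchall()
```

Calling `top_names(demo_connection(), 2000)` groups the rows for that year by name and ranks them by total, which is the sort of query that tends to surface a possible story.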

There was more SQLite with a database of shooting incidents involving Dallas police officers, this time via a notebook. And we finished with the Knight Center’s fine SQL-based murder mystery.

Enough there to get you started (or refreshed) with your SQL syntax.

Filed Under: Blog Tagged With: coding, data, education, investigation, SQL, tools

Digital Needles in the Data Haystack

24th May 2017 by Martin Chorley

We’re presenting today at “Investigating (with) Big Data”, a one-day symposium being held at Cardiff University by the Digital Culture Network. Our talk, “Digital Needles in the Data Haystack”, examines the use of data by news organisations, focusing on the challenges they face when carrying out investigations with increasingly large volumes of data. We discuss the collaborations that organisations have built to get past such problems, and talk about some of the issues surrounding the use of data within newsrooms.

It looks to be an interesting day of talks on a range of different topics connected to ‘Big Data’, and we’re looking forward to it!

Filed Under: Blog Tagged With: data, engagement, investigation, talks

Scraping the Assembly

2nd November 2016 by Martin Chorley

Glyn is currently teaching the first-semester module on Data Journalism. As part of this, students need to complete a data investigation project. One of the students is looking at the expenses of Welsh Assembly Members. These are all freely available online, but not in an easy-to-manipulate form. According to the Assembly, they’d be happy to give the data out as a spreadsheet if we submitted an FOI.

To me, this seems quite stupid. The information is all online and freely accessible. You’ve admitted you’re willing to give it out to anyone who submits an FOI. So why not just make the raw data available to download? This does not sound like a helpful Open Government to me. Anyway, for whatever reason, they’ve chosen not to, and we can’t be bothered to wait around for an FOI to come back. It’s much quicker and easier to build a scraper! We’ll just use Selenium to drive a web browser, submit a search, page through all the results collecting the details, then dump it all out to CSV. Simple.
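The paging part of that plan can be sketched roughly as below. This is not the code from the repository, just a minimal illustration of the loop, and everything named here is an assumption: the `scrape_results` function, the `max_pages` cap and the CSS selectors you pass in are all hypothetical and would need matching to the real page. It expects a Selenium WebDriver (or anything with the same `find_elements` interface) that has already loaded the first page of search results.

```python
# Locator strategy string; this is the same value as selenium's
# By.CSS_SELECTOR constant, written out so the sketch has no hard
# dependency on selenium being importable.
CSS = "css selector"


def scrape_results(driver, row_css, next_css, max_pages=500):
    """Collect the cell text of every result row, following 'next' links.

    driver   -- a Selenium WebDriver already showing the first results page
    row_css  -- hypothetical CSS selector matching one result row
    next_css -- hypothetical CSS selector matching the 'next page' link
    """
    rows = []
    for _ in range(max_pages):  # hard cap so a stuck 'next' link cannot loop forever
        for row in driver.find_elements(CSS, row_css):
            cells = [cell.text.strip() for cell in row.find_elements(CSS, "td")]
            if cells:  # header rows use <th>, so they yield no <td> cells
                rows.append(cells)
        next_links = driver.find_elements(CSS, next_css)
        if not next_links:
            break  # no 'next' link means this was the last page
        next_links[0].click()
    return rows
```

With Selenium installed you would create a driver with something like `webdriver.Firefox()`, call `driver.get(...)` on the search page, submit the search, and then hand the driver to `scrape_results`. A real scraper would also want explicit waits after each click, since this sketch assumes the next page has loaded by the time the loop comes round again.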

I built this as a quick hack this morning. It took about an hour or so, and it shows. The code is not robust in any way, but it works. You can ask it for data from any year (or a number of years) and it’ll happily sit there churning its way through the results and spitting them out as both .csv and .json.
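Writing the same records out in both formats needs nothing beyond the standard library. The sketch below is an illustration rather than the scraper's actual output code: the function names are made up, and it assumes each scraped record is a flat dict with the same keys.

```python
import csv
import io
import json


def records_to_csv(records):
    """Serialise a list of flat dicts to CSV text.

    The header row is taken from the keys of the first record, so this
    assumes every record has the same keys.
    """
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def records_to_json(records):
    """Serialise the same records to indented JSON text."""
    return json.dumps(records, indent=2, ensure_ascii=False)
```

Each function returns a string, so saving to disk is just a matter of writing that string to a `.csv` or `.json` file opened with `encoding="utf-8"`.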

All the code is available on GitHub and it’s under an MIT Licence. Have fun 😉

Filed Under: Blog, Teaching Tagged With: coding, data, foi, investigation, oss, python, scraping

Visualising data – first try to get the information #foia

10th November 2015 by Glyn Mottershead

Derelict House © Copyright David Wright and licensed for reuse under this Creative Commons Licence

We’re currently looking at how many empty properties there are across Wales.

The students and staff are in the process of using the Freedom of Information Act to get datasets as part of our sessions around data journalism and visualising data.

(Martin recently did a post looking at different technologies for wrangling the data from the first request).

We’ve applied to all 22 of the Welsh councils for the same information and are starting to get responses before the 20-working-day statutory limit.

And it is really interesting to see how that request is being interpreted by different councils.

So, to explain that, we’ll go back to our first application: Cardiff.

What do they know?

Advanced search on What Do They Know FOI site

To keep the application and responses in public, our preferred way of working is to make the application on What Do They Know.

This is a great resource for anyone interested in public data. Its advanced search facility really helps you find what you are looking for, as it uses a syntax familiar to anyone who has used advanced Google search techniques.

We asked for:

1 The number of
2 address (including street number and postcode) of homes that:

a) have been empty for over 6 months
b) have been empty for under 6 months
c) your empty homes strategy including what empty homes (if any) you prioritise.

And we got it. The only issue was that we had asked for Excel but received PDFs. Tabula is a great tool for dealing with information locked in PDF format, but in this case we simply asked again for the Excel files and got them.

The rest of Wales

We then applied to the other 21 authorities in Wales, and have had widely varying results.

We’d already picked up that there might be some issues, given the phrasing coming back from Cardiff, so we made sure that later applications acknowledged (and hopefully dealt with) the anticipated exceptions.

And we have hit one in particular so far: Section 31(1)(a) – crime.

Section 31(1)(a) the prevention or detection of crime

  1. Section 31(1)(a) will cover all aspects of the prevention and detection of crime. It could apply to information on general policies and methods adopted by law enforcement agencies. For example, the police’s procedures for collecting forensic evidence, Her Majesty’s Revenue and Customs procedures for investigating tax evasion.
  2. The exemption also covers information held by public authorities without any specific law enforcement responsibilities. It could be used by a public authority to withhold copies of information it had provided to a law enforcement agency as part of an investigation. It could also be used to withhold information that would make anyone, including the public authority itself, more vulnerable to crime for example, by disclosing its own security procedures, such as alarm codes.
  3. Whilst in some instances information held for the purposes of preventing or detecting crime will be exempt, it does not have to be held for such purposes for its disclosure to be prejudicial.

There is a public interest test to this exemption, so we’re off to read up on the Information Commissioner’s rulings on this, as we’ve already had a few knock-backs.

I will come back to the post, and the applications, when we’ve got all the responses we asked for.

Filed Under: Blog, Teaching, The Lab Tagged With: foi, FOIA, investigation
