Sample Database: Movies (ERD and SQL) (2024)

When you’re learning SQL or database design, it’s helpful to use other databases as a reference.

Many articles online refer to Oracle’s HR database or SQL Server’s AdventureWorks database. These can be helpful, but often you have questions that these databases can’t help you answer.

Which is where this post comes in.

This post describes a sample database containing data about movies. It includes:

  • An ERD (entity relationship diagram) for the sample movie database
  • An explanation of the tables and columns
  • A download of sample data to create and populate this database
  • An example query on the database

Why is this helpful? Firstly, you can understand more about how a movie database might work.

Also, you can practice SQL against realistic data and write your own queries, both simple (how many movies has Tom Cruise been in?) and complex (which movies have Tom Cruise and Matt Damon both been in?).

So, let’s take a look at the database.

Sample Movie Database: ERD

The ERD or database design of the sample movie database is here (open in new tab, or save, to see a larger version):

[Image: ERD of the sample movie database]

This database stores information about movies, the cast and crew involved, where the movie was produced and by which company, and other information about movies such as the languages, genres, and keywords.

The sample data was obtained from a free online data source. It contains about 4,800 movies, 104,000 cast and crew, and thousands of metadata records such as languages and keywords.

What do all of these tables and columns mean?

While you’re here, if you want an easy-to-use list of the main features in SQL for different vendors, get my SQL Cheat Sheets here:

Table Explanations

The movie table contains information about each movie. There are text descriptions such as title and overview. Some fields are more obvious than others: revenue (the amount of money the movie made) and budget (the amount spent on creating the movie). Other fields are calculated from data in the original data source: popularity, vote_average, and vote_count. The status indicates whether the movie is Released, Rumoured, or in Post-Production.

The country table contains a list of countries, and the movie_country table records which countries each movie was filmed in (some movies are filmed in multiple countries). This is a standard many-to-many table, and you’ll find these in a lot of databases.
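To make the many-to-many idea concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The column names (movie_id, country_id, country_name) and the rows are placeholders I've assumed for illustration, not the real sample data, so check them against the ERD before reusing the query.

```python
import sqlite3

# In-memory database with a cut-down, assumed version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE country (country_id INTEGER PRIMARY KEY, country_name TEXT);
    -- Junction table: one row per (movie, country) pair.
    CREATE TABLE movie_country (
        movie_id INTEGER REFERENCES movie (movie_id),
        country_id INTEGER REFERENCES country (country_id)
    );
    -- Placeholder rows only (not taken from the real data set).
    INSERT INTO movie VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO country VALUES (10, 'Country X'), (20, 'Country Y');
    INSERT INTO movie_country VALUES (1, 10), (1, 20), (2, 10);
""")

# Resolve the many-to-many link by joining through the junction table.
country_rows = conn.execute("""
    SELECT m.title, c.country_name
    FROM movie m
    INNER JOIN movie_country mc ON mc.movie_id = m.movie_id
    INNER JOIN country c ON c.country_id = mc.country_id
    ORDER BY m.title, c.country_name
""").fetchall()

for title, country_name in country_rows:
    print(title, "-", country_name)
```

The same two-join pattern should apply to any of the other junction tables in this database.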

The same concept applies to the production_company table. There is a list of production companies and a many-to-many relationship with movies which is captured in the movie_company table.

The languages table has a list of languages, and the movie_languages table captures the languages used in each movie. The difference with this structure is the addition of a language_role table. This language_role table contains two records: Original and Spoken. A movie can have one original language (e.g. English) but many spoken languages. Each row in the movie_languages table therefore stores a language along with its role.
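Here is a small sketch of how the role lookup might be used in a query, again with Python's sqlite3 module and an in-memory database. The table name language, the column names, and the single placeholder movie (ID 100) are assumptions for illustration; only the two role values, Original and Spoken, come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed table and column names; compare with the ERD.
    CREATE TABLE language (language_id INTEGER PRIMARY KEY, language_name TEXT);
    CREATE TABLE language_role (role_id INTEGER PRIMARY KEY, language_role TEXT);
    CREATE TABLE movie_languages (
        movie_id INTEGER,
        language_id INTEGER,
        language_role_id INTEGER
    );
    INSERT INTO language VALUES (1, 'English'), (2, 'French');
    INSERT INTO language_role VALUES (1, 'Original'), (2, 'Spoken');
    -- Placeholder movie 100: original language English, spoken English and French.
    INSERT INTO movie_languages VALUES (100, 1, 1), (100, 1, 2), (100, 2, 2);
""")

# Each movie_languages row pairs a language with the role it plays in the movie.
language_rows = conn.execute("""
    SELECT ml.movie_id, lr.language_role, l.language_name
    FROM movie_languages ml
    INNER JOIN language l ON l.language_id = ml.language_id
    INNER JOIN language_role lr ON lr.role_id = ml.language_role_id
    ORDER BY ml.movie_id, lr.language_role, l.language_name
""").fetchall()

for movie_id, role, name in language_rows:
    print(movie_id, role, name)
```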

Genres define which category a movie fits into, such as Comedy or Horror. A movie can have multiple genres, which is why the movie_genres table exists.
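As a rough illustration, counting movies per genre could look like the sketch below. It uses Python's sqlite3 module with an in-memory database; the genre table, its column names, and the placeholder movie IDs are assumptions, so adjust them to match the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed table and column names; compare with the ERD.
    CREATE TABLE genre (genre_id INTEGER PRIMARY KEY, genre_name TEXT);
    CREATE TABLE movie_genres (movie_id INTEGER, genre_id INTEGER);
    INSERT INTO genre VALUES (1, 'Comedy'), (2, 'Horror');
    -- Placeholder movies 100 and 200; movie 100 has two genres.
    INSERT INTO movie_genres VALUES (100, 1), (100, 2), (200, 1);
""")

# One row per genre, with the number of movies tagged with it.
genre_counts = conn.execute("""
    SELECT g.genre_name, COUNT(*) AS movie_count
    FROM genre g
    INNER JOIN movie_genres mg ON mg.genre_id = g.genre_id
    GROUP BY g.genre_name
    ORDER BY movie_count DESC, g.genre_name
""").fetchall()

for genre_name, movie_count in genre_counts:
    print(genre_name, movie_count)
```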

The same concept applies to keywords, but there are a lot more keywords than genres. I’m not sure what qualifies as a keyword, but you can explore the data and take a look. Some examples are “paris”, “gunslinger”, and “saving the world”.

The cast and crew section of the database is a little more complicated. Actors, actresses, and crew members are all people, playing different roles in a movie. Rather than have separate lists of names for crew and cast, this database contains a table called person, which has each person’s name.

The movie_cast table contains records of each person in a movie as a cast member. It has their character name, along with the cast_order, which I believe indicates that lower numbers appear higher on the cast list.
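The two kinds of question mentioned in the introduction (how many movies a person has been in, and which movies two people share) can be sketched against these tables. The example below uses Python's sqlite3 module with an in-memory database. The column names and every row are placeholders I've assumed, with generic names such as "Actor A" standing in for real people.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed column names; compare with the ERD.
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, person_name TEXT);
    CREATE TABLE movie_cast (
        movie_id INTEGER,
        person_id INTEGER,
        character_name TEXT,
        cast_order INTEGER
    );
    -- Placeholder rows only.
    INSERT INTO movie VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
    INSERT INTO person VALUES (1, 'Actor A'), (2, 'Actor B');
    INSERT INTO movie_cast VALUES
        (1, 1, 'Hero', 0), (1, 2, 'Sidekick', 1),
        (2, 1, 'Pilot', 0), (3, 2, 'Detective', 0);
""")

# Simple: how many movies has one person appeared in?
movie_count = conn.execute("""
    SELECT COUNT(DISTINCT mc.movie_id)
    FROM movie_cast mc
    INNER JOIN person p ON p.person_id = mc.person_id
    WHERE p.person_name = ?
""", ("Actor A",)).fetchone()[0]

# More complex: which movies feature both people?
# Keep only movies where both of the two named people appear.
shared_titles = [row[0] for row in conn.execute("""
    SELECT m.title
    FROM movie m
    INNER JOIN movie_cast mc ON mc.movie_id = m.movie_id
    INNER JOIN person p ON p.person_id = mc.person_id
    WHERE p.person_name IN (?, ?)
    GROUP BY m.movie_id, m.title
    HAVING COUNT(DISTINCT p.person_id) = 2
    ORDER BY m.title
""", ("Actor A", "Actor B"))]

print(movie_count)
print(shared_titles)
```

Grouping with HAVING is only one way to answer the second question; a self-join on movie_cast or an INTERSECT of two queries would be reasonable alternatives.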

The movie_cast table also links to the gender table, to indicate the gender of each character. The gender is linked to the movie_cast table rather than the person table to cater for characters that may be a different gender from the person, or characters of unknown gender. This means that gender is not linked to the person table, which reflects how the sample data is structured.

The movie_crew table follows a similar concept and stores all crew members for all movies. Each crew member has a job, which is part of a department (e.g. Camera).
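A crew listing for one movie might be queried as in the sketch below, using Python's sqlite3 module and an in-memory database. I've assumed that movie_crew stores the job as text next to a department_id, and all rows apart from the Camera department name are invented placeholders, so treat this as an outline and not the exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed table layout; compare with the ERD.
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, person_name TEXT);
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE movie_crew (
        movie_id INTEGER,
        person_id INTEGER,
        department_id INTEGER,
        job TEXT
    );
    -- Placeholder rows only.
    INSERT INTO person VALUES (1, 'Person X'), (2, 'Person Y');
    INSERT INTO department VALUES (1, 'Camera'), (2, 'Directing');
    INSERT INTO movie_crew VALUES
        (100, 1, 1, 'Director of Photography'),
        (100, 2, 2, 'Director');
""")

# List the crew of one movie with each person's job and department.
crew_rows = conn.execute("""
    SELECT d.department_name, mcr.job, p.person_name
    FROM movie_crew mcr
    INNER JOIN person p ON p.person_id = mcr.person_id
    INNER JOIN department d ON d.department_id = mcr.department_id
    WHERE mcr.movie_id = ?
    ORDER BY d.department_name, mcr.job
""", (100,)).fetchall()

for department_name, job, person_name in crew_rows:
    print(department_name, "|", job, "|", person_name)
```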

Sample Data

I’ve prepared some sample data for this database. You can use this to create this database on your own computer, explore the tables, and write SQL on it.

The sample data is available for Oracle, SQL Server, MySQL, and Postgres, and is stored in my GitHub repository. Find out how to access it and load the data here: Sample Data for SQL Databases

Sample Query

With the sample data in the database, let’s take a look at some of the data in the movie table. This query shows the movie title, budget, and other attributes of each movie, sorted by revenue from highest to lowest.

SELECT title, budget, release_date, revenue, runtime, vote_average
FROM movie
ORDER BY revenue DESC;

Results (top 20 rows only):

title | budget | release_date | revenue | runtime | vote_average
Avatar | 237000000 | 2009-12-10 | 2787965087 | 162 | 7.2
Titanic | 200000000 | 1997-11-18 | 1845034188 | 194 | 7.5
The Avengers | 220000000 | 2012-04-25 | 1519557910 | 143 | 7.4
Jurassic World | 150000000 | 2015-06-09 | 1513528810 | 124 | 6.5
Furious 7 | 190000000 | 2015-04-01 | 1506249360 | 137 | 7.3
Avengers: Age of Ultron | 280000000 | 2015-04-22 | 1405403694 | 141 | 7.3
Frozen | 150000000 | 2013-11-27 | 1274219009 | 102 | 7.3
Iron Man 3 | 200000000 | 2013-04-18 | 1215439994 | 130 | 6.8
Minions | 74000000 | 2015-06-17 | 1156730962 | 91 | 6.4
Captain America: Civil War | 250000000 | 2016-04-27 | 1153304495 | 147 | 7.1
Transformers: Dark of the Moon | 195000000 | 2011-06-28 | 1123746996 | 154 | 6.1
The Lord of the Rings: The Return of the King | 94000000 | 2003-12-01 | 1118888979 | 201 | 8.1
Skyfall | 200000000 | 2012-10-25 | 1108561013 | 143 | 6.9
Transformers: Age of Extinction | 210000000 | 2014-06-25 | 1091405097 | 165 | 5.8
The Dark Knight Rises | 250000000 | 2012-07-16 | 1084939099 | 165 | 7.6
Toy Story 3 | 200000000 | 2010-06-16 | 1066969703 | 103 | 7.6
Pirates of the Caribbean: Dead Man’s Chest | 200000000 | 2006-06-20 | 1065659812 | 151 | 7
Pirates of the Caribbean: On Stranger Tides | 380000000 | 2011-05-14 | 1045713802 | 136 | 6.4
Alice in Wonderland | 200000000 | 2010-03-03 | 1025491110 | 108 | 6.4
The Hobbit: An Unexpected Journey | 250000000 | 2012-11-26 | 1021103568 | 169 | 7
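The query itself returns every movie, so showing only the top rows means adding a row limit. Below is a minimal sketch using Python's sqlite3 module, loaded with three of the rows from the results above (in a deliberately shuffled order) and a cut-down movie table with only three columns. The LIMIT clause also works in MySQL and Postgres, while SQL Server uses SELECT TOP and Oracle 12c or later uses FETCH FIRST n ROWS ONLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down movie table with only the columns this example needs.
conn.execute("CREATE TABLE movie (title TEXT, budget INTEGER, revenue INTEGER)")
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [
        # Three rows copied from the result listing, inserted out of order.
        ("The Avengers", 220000000, 1519557910),
        ("Avatar", 237000000, 2787965087),
        ("Titanic", 200000000, 1845034188),
    ],
)

# Sort by revenue, highest first, and keep only the first two rows.
top_rows = conn.execute("""
    SELECT title, budget, revenue
    FROM movie
    ORDER BY revenue DESC
    LIMIT 2
""").fetchall()

for title, budget, revenue in top_rows:
    print(title, budget, revenue)
```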

Conclusion

So that’s the sample database for movie information. There’s an ERD you can use to help you understand it or to design your own. You can also download the sample database tables and data to run your own queries on it.

Article information

Author: Madonna Wisozk