Texas Assessment of Knowledge and Skills (TAKS Exams) Single Year Data 2007

Added By Infochimps

Texas Assessment of Knowledge and Skills (TAKS) Scores for one year by Student, Grade and School.

Student IDs are anonymized, but consistent from year to year. Some of the data has been scrubbed to comply with FERPA privacy regulations.

The Texas Assessment of Knowledge and Skills (TAKS) is a standardized test used in Texas primary and secondary schools to assess students’ attainment of reading, writing, math, science, and social studies skills required under Texas education standards. It is developed and scored by Pearson Educational Measurement with close supervision by the Texas Education Agency. Though created before the No Child Left Behind Act was passed, it complies with the law. It replaced the previous test, called the Texas Assessment of Academic Skills or TAAS, in 2003.


The TAKS data comes in three varieties: student level data, campus
level data, and data about tests. For all three types, data is
gathered annually as another round of examinations is administered to
Texas students.

Each type of data consists of records with identical structure. In
the original TAKS data, as received from the Texas Education Agency
(TEA), this was not the case. Campus data in some years would have
columns differing from campus data in other years.

The data you’re working with is a normalized subset of the data given
out by the TEA. For each year, only the columns of the data (student,
campus, or test) that are common to all years were included. Almost
all the interesting columns span the full period of the dataset so
this constraint essentially removed oddball columns. For more
information on the original data, contact the TEA directly.
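
The column selection described above can be sketched as a set
intersection over each year's header. This is only an illustration of
the idea, not the script actually used to build this dataset; the
column names and the helper names (common_columns, normalize) are
hypothetical.

```python
def common_columns(columns_by_year):
    """Return the columns present in every year, in the first year's order."""
    years = sorted(columns_by_year)
    shared = set(columns_by_year[years[0]])
    for year in years[1:]:
        shared &= set(columns_by_year[year])
    return [c for c in columns_by_year[years[0]] if c in shared]


def normalize(rows, keep):
    """Project each row (a dict) onto the kept columns only."""
    return [{c: row.get(c) for c in keep} for row in rows]


# Hypothetical headers for three years of campus files.
columns_by_year = {
    2005: ["campus_id", "name", "enrollment", "mascot"],
    2006: ["campus_id", "name", "enrollment"],
    2007: ["campus_id", "name", "enrollment", "bus_routes"],
}
keep = common_columns(columns_by_year)
print(keep)  # ['campus_id', 'name', 'enrollment']
```

Columns such as "mascot" and "bus_routes" in this made-up example play
the role of the oddball columns that appear in only some years.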

Student Data

  • FERPA Masking

Student level data is not freely given out by the TEA. It was
originally obtained by Dhruv Bansal via a Freedom of Information Act
request and cost hundreds of dollars.

In accordance with FERPA guidelines, the data was masked by the TEA in
such a way that if one can construct a query that might single out a
group of five or fewer students (and one can, easily), then the scores
of those students are set to NULL.

This masking, though inconvenient for a researcher, is intended to
protect the privacy of the individuals comprising this data. It is
because of this masking that the TEA has authorized the general
release of this data online.
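
To make the effect of the masking concrete, here is a minimal sketch of
small-group suppression. It assumes the groups are defined by a fixed
set of columns (campus and grade here), which is a simplification; the
TEA's actual procedure is not documented on this page, and the column
names and the function mask_small_groups are hypothetical.

```python
from collections import Counter


def mask_small_groups(rows, group_keys, score_keys, threshold=5):
    """Set scores to None for every row whose group has <= threshold members."""
    sizes = Counter(tuple(r[k] for k in group_keys) for r in rows)
    masked = []
    for r in rows:
        out = dict(r)  # copy, so the input rows are left untouched
        if sizes[tuple(r[k] for k in group_keys)] <= threshold:
            for s in score_keys:
                out[s] = None
        masked.append(out)
    return masked


# Made-up rows: six students at campus A, two at campus B.
rows = [{"campus": "A", "grade": 3, "math": 2100 + i} for i in range(6)]
rows += [{"campus": "B", "grade": 3, "math": 2000 + i} for i in range(2)]

masked = mask_small_groups(rows, ["campus", "grade"], ["math"])
# Campus A (6 students) keeps its scores; campus B (2 students) is nulled.
```

When working with the real files, expect NULL scores wherever a
combination of attributes identifies five or fewer students.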

  • Normalization

Each student in Texas is granted a unique identifier by the TEA which
is used to track that student through examinations. Students may sit
for examinations multiple times in the same grade. The result of each
such set of examinations is a row in one of the original files
produced by the TEA — while the student may have sat for only the
mathematics exam, this row will also contain (blank) columns for all
other subjects as well.

The row will also contain repeated information about tests that is
actually common to all students who took a particular examination.

These problems have been eliminated in the data you are working with.
Each student has one row per grade which contains the best results
recorded for the student across all examination sessions. Redundant
information about tests has been moved to the tests file (see below).
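
The collapse to one row per student per grade can be sketched as
follows. The sketch assumes that "best" means the highest score per
subject and that a blank subject is represented as None; both are
assumptions, as are the column names and the function best_per_grade.

```python
def best_per_grade(session_rows, subjects):
    """Merge examination sessions into one row per (student, grade)."""
    best = {}
    for row in session_rows:
        key = (row["student_id"], row["grade"])
        if key not in best:
            merged = {"student_id": row["student_id"], "grade": row["grade"]}
            merged.update({s: None for s in subjects})
            best[key] = merged
        merged = best[key]
        for s in subjects:
            score = row.get(s)
            # Skip blanks; keep the highest score seen so far.
            if score is not None and (merged[s] is None or score > merged[s]):
                merged[s] = score
    return list(best.values())


# Made-up sessions: student s1 sat the grade 5 exams twice.
sessions = [
    {"student_id": "s1", "grade": 5, "math": 2050, "reading": None},
    {"student_id": "s1", "grade": 5, "math": 2190, "reading": 2210},
    {"student_id": "s2", "grade": 5, "math": None, "reading": 2300},
]
best = best_per_grade(sessions, ["math", "reading"])
```

Here the two s1 rows become a single row with math 2190 and reading
2210, while s2 keeps a blank math score.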

Campus Data

Campus level data is actually available directly from the TEA because
there are fewer privacy concerns than with student level data.

What the TEA hands out is in a very messy format, with the campus
properties changing willy-nilly from year to year.

The data you are working with has been cleaned so that only the
columns common to all years are kept (this winds up being most of
them). The schools were also geocoded with latitude and longitude.

The campus data is, unfortunately, not as complete as might be hoped
for; many of the files the TEA makes available are missing data.

Test Data

Test data summarizes the common characteristics of each test that was
taken by the students of Texas.

Test data was originally intermingled with student data: each student
row had columns like H_IKEY and M_IKEY, representing the correct
answers to the history and mathematics exams, respectively. These
data were extracted and pulled into a single tests file.
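
A minimal sketch of that extraction is shown below. H_IKEY and M_IKEY
come from the original files; everything else (the year and grade
columns used to identify a test, the m_score column, the answer-key
strings, and the function split_tests) is hypothetical and only there
to make the example run.

```python
def split_tests(rows, test_key, test_columns):
    """Separate per-test columns from student rows.

    Returns (student_rows, test_rows), with one test row per distinct
    value of the test_key columns.
    """
    tests = {}
    students = []
    for row in rows:
        key = tuple(row[k] for k in test_key)
        if key not in tests:
            record = {k: row[k] for k in test_key}
            record.update({c: row[c] for c in test_columns})
            tests[key] = record
        # Student rows keep everything except the repeated test columns.
        students.append({k: v for k, v in row.items() if k not in test_columns})
    return students, list(tests.values())


# Made-up rows: two students who took the same examinations.
rows = [
    {"student_id": "s1", "year": 2007, "grade": 8, "m_score": 2100,
     "H_IKEY": "ACBD", "M_IKEY": "BDAC"},
    {"student_id": "s2", "year": 2007, "grade": 8, "m_score": 2250,
     "H_IKEY": "ACBD", "M_IKEY": "BDAC"},
]
students, tests = split_tests(rows, ["year", "grade"], ["H_IKEY", "M_IKEY"])
```

The answer keys are stored once in tests rather than once per student,
and each student row can still be matched to its test through the
shared key columns.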