Preparing text for pencil-and-paper or computer-based QDA (qualitative data analysis) requires some thought and planning.  It is also possible to use audio or video files directly (no transcription), or to transcribe audio or video using software products.  (It is not yet possible to have software make text transcripts of recordings - unless the recording has only one speaker.  This means that human transcription is needed to generate text from audio or video for dyads or groups - the bulk of most research work - and it takes time or money.)

Some general notes about online qualitative data analysis, including transcription and data preparation, are offered by Ann Lewins and colleagues at the University of Surrey, as well as colleagues from the University of Huddersfield. This is a neat site detailing an approach one set of researchers have used and found helpful.

I would add that there are no "standard" forms of transcription - Jeffersonian is perhaps the closest but does not seem suitable and economical for all uses.

On a more abstract level Allen Renear of Brown University examines the theory and metatheory of text encoding.

There have been "threads" on this topic in the ATLAS.ti forum (and also a bit on video).


May I suggest:  Code all recordings with a name made of letters and numbers so as to disguise the participants' names.  You may, however, want to state plainly at the top of the transcript who was the interviewer, transcriber, etc.  Adding the date is always good and helps make a list of what was done when.  Using codes may be required by some IRBs to protect privacy.  (I usually start my audio recordings by re-stating that the participants have understood and agreed to participate after reviewing the consent documents.  This puts the consent to participate on tape as well as on paper.  I do not have this transcribed.)

You should have a signed statement of confidentiality from all transcribers.  This signed document holds them to maintaining confidentiality of the information they transcribe.  This, too, is often IRB required.

Have the transcriptionist remove identifying names and replace all occurrences with a code, such as P1 for the primary participant and P2, etc.  The interviewers should be identified similarly as I1 and I2, etc.  A general key to the codes can be placed at the start of the interview transcript.
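If the transcript is already a computer file, the name-to-code replacement can be partly automated.  What follows is only a minimal sketch in Python, not a tested tool: the function name replace_names and the sample names and codes are made up for illustration.  It matches whole words, case-sensitively, so nicknames and misspellings will slip through, and you still need to proof the result by eye.

```python
import re

def replace_names(text, name_to_code):
    """Replace each whole-word name in text with its code (case-sensitive)."""
    # Handle longer names first so "Mary Ann" is replaced before "Mary".
    for name in sorted(name_to_code, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        code = name_to_code[name]
        # A function replacement avoids any special handling of backslashes.
        text = re.sub(pattern, lambda match, code=code: code, text)
    return text

# Hypothetical names and codes, for illustration only.
codes = {"Maria": "P1", "Tom": "P2", "Lee": "I1"}
raw = "Lee: How did Tom react?\n\nMaria: Tom said nothing to me."
print(replace_names(raw, codes))
```

Keep the name-to-code list itself in a separate, secured file; only the general key (P1 = primary participant, I1 = first interviewer) belongs in the transcript.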

Leave a blank line between turns (he said, she said), but single space the text. This saves space and displays the turn taking.

Use large side margins.

Be sure to date the work!

If you use a list of questions, you may wish to embed them for searching later.  Often it is enough to mark the question so long as the response follows directly.

These days it is fine to keep text as Word documents or WordPerfect documents.  However, ASCII text or RTF format may be required by certain computer programs (make sure you check this out early on).  Format conversion is easy - just "Save As" in the new format with the same name using your word processor.
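Word processors often insert "curly" quotes and other characters that are not plain ASCII, and these can trip up programs that expect ASCII text.  Here is a small Python sketch, under the assumption that you already have the transcript text in hand, that lists any non-ASCII characters by line.  The function name find_non_ascii and the sample text are my own illustrations, not part of any QDA package.

```python
def find_non_ascii(text):
    """Return (line number, character) pairs for characters outside plain ASCII."""
    problems = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for char in line:
            if ord(char) > 127:
                problems.append((line_number, char))
    return problems

# Illustrative sample containing the curly quotes a word processor might add.
sample = "P1: I said \u201cno\u201d to that.\n\nI1: Why?"
for line_number, char in find_non_ascii(sample):
    print(f"line {line_number}: {char!r}")
```

An empty result suggests the text is safe to save as ASCII; otherwise, fix the listed characters in your word processor first.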


First, the hardest step.  You need to find an able and reliable transcriptionist.  The best route is to ask colleagues and ask for references.  Costs and turnaround times come next after quality and reliability.  I received an email from a transcriptionist who viewed this page stating: "Typically it takes 3-5 hours to transcribe 60 minutes... I charge $100 per 60 minutes of clear recording of one or two people with little or no accent (prices go up if there are more people, accents, or poor recording)."  This is about $25 per hour of work time - but $100 per hour of recorded time.  I have also heard there is great variation in cost - which is usually higher in large urban areas and more varied elsewhere.
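The arithmetic behind those figures is simple enough to sketch in Python.  The $100 fee and the 3 to 5 hours of work come from the email quoted above; the function names and the list of interview lengths are hypothetical, just to show how a budget estimate might be worked out.

```python
def effective_hourly_rate(fee_per_recorded_hour, work_hours_per_recorded_hour):
    """What the transcriptionist earns per hour actually worked."""
    return fee_per_recorded_hour / work_hours_per_recorded_hour

def project_cost(recorded_minutes, fee_per_recorded_hour):
    """Total fee for a set of recordings billed by recorded time."""
    return sum(recorded_minutes) / 60 * fee_per_recorded_hour

fee = 100.0  # dollars per recorded hour, from the email quoted above
for work_hours in (3, 4, 5):
    rate = effective_hourly_rate(fee, work_hours)
    print(f"{work_hours} hours of work per recorded hour: ${rate:.2f} per hour worked")

# Hypothetical interview lengths in minutes, for illustration only.
interviews = [60, 90, 45]
print(f"Estimated transcription cost: ${project_cost(interviews, fee):.2f}")
```

At the midpoint of 4 hours of work per recorded hour, this gives the $25 per hour mentioned above.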

No transcriptionist is perfect.  (Do compare some of the earliest transcribed material you receive back against the tape to assess quality before too much time and money has been invested with a new transcriptionist.)  You do need to edit and proof their work.  This takes time and effort, but is worth it.  One learns through re-hearing - and by the random fragmenting of text that occurs in transcription.

Actual run times in minutes and seconds (from the tape start, such as 21:49) are better than arbitrary counter numbers for locating sections of tape - if available on your transcriptionist's equipment.
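If your player or software reports position in seconds, converting to and from the minutes:seconds form is easy.  A minimal Python sketch follows; the function names are my own, and it assumes recordings are marked in minutes and seconds only (no hours field).

```python
def to_timestamp(total_seconds):
    """Convert elapsed seconds from the start of the recording to M:SS."""
    minutes, seconds = divmod(int(total_seconds), 60)
    return f"{minutes}:{seconds:02d}"

def to_seconds(timestamp):
    """Convert an M:SS run time back to elapsed seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)

print(to_timestamp(1309))   # the 21:49 example above
print(to_seconds("21:49"))
```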

Back up electronic media, and always take the tabs off the back of cassettes to safeguard your work.  This stops machines from recording over your work by mistake.  Duplicating tapes reduces quality from the original and is probably overkill unless the content is extremely valuable.  (By contrast, copying computer files is easy and loses no quality.)


Get a transcription machine (or transcriber) with a foot switch (also called a treadle switch).  I have had good luck with a Panasonic cassette transcribing machine.  These products run about $200.00.  Make sure the transcriber/recorder has a counter or timer to mark your place on the tape. 

Alternately, use a cassette tape machine with a foot pedal, though this may be hard to find. 

Note: You don't need this equipment if your transcriptionist can access a digital computer file.  Check with your transcriptionist before buying equipment if possible.


For transcription of audio tapes and written documents, I use Dragon NaturallySpeaking Preferred (about $200).  This software allows you to speak into a headset microphone and get about 98% accurate text in formats easily converted for use with QDA software (or for direct import with many software packages).  The latest version, Dragon 9, does not require "training" before use.  Coupled with a USB microphone it works quite well, but not quite perfectly.  On everyday speech it is great; on technical terms and names, less so.  (The USB microphone connection bypasses your computer's sound card and generally yields better input sound quality than non-USB systems.  I use a Telex H-841, which is light, adjustable, and high quality.  $40 US online.)

I am not impressed with the Microsoft speech to text software that comes with some versions of Windows XP and Vista.

Pros and Cons of Speech to Text Software: 1) It's not a panacea.  It takes time and work (but so does any form of transcription - including verifying the work you paid $40 to $80+ an hour for is really accurate!)  Still, it may be advantageous for you.  It is a real bonus for people with disabilities or hand and wrist problems.


Have a fairly new computer.  (Sorry, I am only competent to discuss PCs; I have no experience with Macs.)  Voice recognition takes a modern computer.  However, most from the past few years will do fine (if not economy models).  But don't even try this with less than a Pentium III 500 CPU and 512 Megs of memory.  Optimally, buy the latest Intel or AMD CPU with lots of memory.  (Hard disk size is not an issue on any recent computer.)  Why a new computer?  Speech recognition is very computationally intensive -- it makes any computer work hard.  Pentium III chips and all newer models are optimized to work on speech recognition.  AMD Athlons work fine too - though Intel chips are often better optimized for audio work.  (Older Celerons, Pentiums, Pentium IIs and Durons are not so optimized.)   You also will need USB ports, which are common in computers built after 1999 and require Windows 98, Me, NT, or XP.  Such a computer can be expensive ($600 and up) but is crucial to your success.

Use a Telex (or similar) USB headset.  These microphones plug into the USB port of recent computers.  Their benefit is that they bypass the sound card in your computer.  While your sound card may make great music, its circuitry is not optimized for high quality microphone input.  The USB microphone in the Telex headset bypasses your sound card and the electronic "noise" it introduces, producing better voice recognition.  Several models are now available.  These cost about $50 to $75 (US).  The Telex H-841 is a fine choice.  The headset that comes with the Dragon software is also fine.

Work in a quiet environment.

Do the required setup and optional voice training for the voice recognition software.  Then add to this base vocabulary and sense of your voice pattern by running the vocabulary builder several times.  This process allows you to include several of your written documents in the program's voice pattern knowledge base.  Practice by doing some letters or memos.  Only then try a transcription from tape or disk.

Getting good results requires some time and effort.  You need to use the software and train it to get optimal results.  Don't be discouraged early on - things improve.

HOW TO TRANSCRIBE AUDIO:  1)  Start your computer and voice recognition software.  

Cassettes:  Set up your transcriber near your computer.  Put the treadle at your feet, under the keyboard.  Be sure you can sit comfortably and see the monitor.  I use a cheap in-the-ear earphone for output from the transcriber.  Press the play button on the transcriber; then, by pressing the foot switch, you start and stop playback of the data tape.

Computer Files:  You can use Winamp (an audio player available for free download) to play .mp3 files, but must mouse click to start and stop the playback.  Alternately, try Transcriber, described below, or f4.

Play a bit of audio, pause it, type, and then play the next bit.  This takes time but is free.  f4 can be used with a foot pedal; Transcriber uses the mouse.

Listen to a segment (about a sentence), and then speak it into the voice recognition headset.  The text appears on the screen (allowing any editing if it is needed).  

Repeat the sequence of listening to a bit of the tape, followed by speaking the text you just heard into the voice recognition headset.

Save the data file often.  (Check the file format of any software you plan to use - some are very restrictive.  Use the word processor to save the material in the required format via the Save As feature.)

Yes, it is a bit cumbersome, and you should not try to stand up quickly, but it works well.  

Pros and Cons:  1)  It's expensive.  Yes.  Still the transcribing equipment can be used by others, too.  As for cost, transcribing an hour of text will cost from $40 to $80 if done by a paid typist.  This adds up over a modest number of interviews.  Paying others is expensive, too.

2)  It's time consuming.  Yes.  No question.  But doing a quality check on paid transcriptions requires listening to each tape in real time, plus more time when you make corrections.  So someone should be spending at least an hour and fifteen minutes to proof each hour of transcription.  If the researcher does the proofing or the voice entry, that time gets you closer to the data.  In fact, breaking the text up into small units gives you a somewhat different "take" than fluid listening.  You can review unclear sections (and there are always some words you can't make out), and you basically proof as you go.  I find it takes between three and four hours per hour of tape, but I am well prepared for analysis at the end.  (I often make notes for unexpected codes as I transcribe -- but these get reviewed and re-assessed later.)

3)  Speaking the participant's views may add some perspective and depth.

4)  You get to add contextual notes or memos about contextual issues. A paid transcriber will likely get you just the "facts."   As researcher, you may add memos about affect, prosody, inflection, events co-occurring with content (walkthroughs, faces made by children, etc).  This too may add richness and help keep data in context.

Qualitative research is time intensive.  Listening to tapes requires as much time as the original interaction that generated the tape - or more.  In the time=money equation it is inherently expensive.  The trade-off here is your comfort with a new technology and its equipment versus paying others for transcription and then having to check for accuracy and completeness.  With grant funding, paying for transcription may be relatively easy, but proofing remains time consuming and offers a less extensive engagement with the raw data.

TRANSCRIBING VIA COMPUTER  (An alternative to using a separate transcribing machine)

A free software download, Transcriber, makes your Windows PC into a transcribing machine.  Transcriber allows you to play the audio file using mouse-activated controls that look like those of a cassette or CD player.  You play a section, pause, then type what you have heard.  Not only do you get the text, Transcriber also links the text to the audio file.  That is, when you highlight a line, the audio for that line plays.  It's no slower than other approaches and you get a great result.  Text can be transferred to standard word processing programs if you simply "save as" .rtf (rich text format) or plain text.   The URL to download the Transcriber is

Another program also allows transcription from a computer.  It is called f4.  It is very much like Transcriber and works with a foot pedal, but it does not show the audio waveform.  It has a nice autosave function and is a free download.

Yet another free transcription software package is Express Scribe.  It, too, can be used with a foot pedal.


Transana software allows you to transcribe video files in much the way Transcriber does for audio.  Transana was created by Chris Fassnacht and is maintained and improved by David Woods.  You link a video file to Transana, then play a bit, pause the playback, and type the text.  Transana has a bit of a learning curve, but works very well and is worth the effort.  (Of course, you can segment and code both video and audio in programs such as ATLAS.ti.)

Transana is now available at modest cost (to pay Dr. Woods for his great work), but older versions may be available as freeware.


Adobe Audition, formerly known as CoolEdit, is widely praised for editing audio.  You do this to improve sound quality as well as to segment the audio into desired sections.

I have found the software that came along with my Roxio CD-RW unit also does a decent job of audio cleanup (hiss removal, etc.) and segmenting audio, but it is time consuming.  All of these tools take at least as long as the recording itself runs.

The free version of Musicmatch Jukebox works fine to convert audio files from one (widely used) format to another - though not proprietary formats (such as used by some Sony and Olympus products or Apple formats).  Also great to shrink high quality files to smaller, lower quality versions.


  J Drisko 1/22/00; last update 10/23/07


