190
I sometimes get files from my clients that have the wrong file extension. For example, the name is image.jpg but the file is actually a TIFF image. In many cases I can work out the real type by opening the file in a text editor, looking at the first few bytes, and deducing the file type from them.
This works for me with JPEG, TIFF, GIF and PDF files. However, there are many more file types out there.
Is it possible to automate identification of the correct file type by analyzing the data the file contains?
- windows
- file-management
- file-extension
edited Apr 12, 2018 at 15:01
Stevoisiak
asked Apr 24, 2011 at 7:36
Martin
6
-
52
For those interested, the file command does this on *nix machines.
–boehj
Commented Apr 24, 2011 at 12:37
-
13
I do not understand why this question is off-topic (after 3 years). I am not asking for specific software (I reworded my question to underline this). I am just asking for a solution.
–Martin
Commented Dec 22, 2014 at 10:13
-
5
I don't understand why 26 people think that boehj's *nix-related comment above "adds something useful to the post". This question is tagged windows, but the comment implies: "You can't do that on Windows, you must use *nix instead". So? The comment is directed at "those interested". In what? Changing computers? :(
–Aacini
Commented Sep 8, 2015 at 14:47
-
6
@Aacini It is useful for *nix people who come here from Google.
–jingyu9575
Commented Nov 14, 2015 at 14:46
-
6
@Aacini Also, Windows 10 now supports bash, so file is now a valid answer to this question (although I haven't tested it).
–ThatMatthew
Commented Aug 15, 2017 at 13:30
6 Answers
181
You can use the TrID tool, which identifies files using a growing library of file type definitions.
Wildcards are supported, so in your example you could just put all the images to be examined in a folder, e.g. C:\verifyimages - then you can use the command:
trid C:\verifyimages\*
This will examine all files in the verifyimages folder.
There is also a GUI version available, TrIDNet.
There is documentation available on how you can easily integrate TrID or TrIDNet into Windows Explorer and Total Commander:
Windows Explorer
Total Commander
edited Nov 2, 2016 at 14:13
answered Apr 24, 2011 at 7:47
Gareth
4
-
8
Do note that it indicates it is not licensed for commercial use, only personal use
–Chris Magnuson
Commented Jan 31, 2015 at 17:31
-
6
I had some trouble figuring out which download files were necessary to use this program. So this comment is to aid in that. You'll need to download two files. First, either the command line utility or the GUI utility. Second, a folder of XML definitions called "TrID XML defs". Place the definition XML files in the same directory as TrID. Then scan definitions. Finally you can start using it.
–mrtsherman
Commented Mar 26, 2015 at 15:40
-
Thanks, mrtsherman, for the clarification. I was confused as well. Docs could be improved, but nice tool!
–Woodchuck
Commented Oct 6, 2018 at 17:26
58
The file command tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed.
The type printed will usually contain one of the words text (the file contains only printing characters and a few common control characters and is probably safe to read on an ASCII terminal), executable (the file contains the result of compiling a program in a form understandable to some UNIX kernel or another), or data, meaning anything else (data is usually "binary" or non-printable). Exceptions are well-known file formats (core files, tar archives) that are known to contain binary data.
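As a rough illustration of the magic number tests described above, here is a minimal Python sketch that compares a file's first bytes against the well-known signatures of the four formats mentioned in the question (JPEG, TIFF, GIF, PDF). The signature table and the names sniff_bytes and sniff_file are made up for this example. The real file command relies on a much larger magic database plus the other test stages, so treat this as a sketch of the idea only.

```python
from typing import Optional

# Leading-byte signatures for the formats named in the question.
# This tiny table is an illustrative assumption, not file's database.
MAGIC_SIGNATURES = [
    (b"\xff\xd8\xff", "jpg"),  # JPEG start-of-image marker
    (b"II*\x00", "tif"),       # TIFF, little-endian byte order
    (b"MM\x00*", "tif"),       # TIFF, big-endian byte order
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
]


def sniff_bytes(header: bytes) -> Optional[str]:
    """Return a suggested extension for the given leading bytes, or None."""
    for signature, extension in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return extension
    return None


def sniff_file(path) -> Optional[str]:
    """Read the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(16))
```

With this sketch, sniff_file("image.jpg") returns "tif" when the file actually begins with a TIFF header, which is the mismatch described in the question.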
edited Oct 24, 2014 at 14:38
G-Man Says 'Reinstate Monica'
answered Apr 24, 2011 at 7:38
Ignacio Vazquez-Abrams
9
-
1
file is standard, but on older systems (especially non-Linux) not very knowledgeable. For Ubuntu etc. it should be quite respectable and even installed as standard.
–Thorbjørn Ravn Andersen
Commented Apr 24, 2011 at 13:28
-
2
I very much doubt that file is an expert on files made by Windows applications.
–Robin Green
Commented Apr 24, 2011 at 20:23
-
6
@Robin: You're welcome to test it.
–Ignacio Vazquez-Abrams
Commented Apr 24, 2011 at 20:27
-
13
@Robin: I very much doubt you've used file at all, and yet you've almost made up your mind about its effectiveness.
–tzot
Commented Apr 24, 2011 at 23:24
-
2
@Gqqnbig, that version of file.exe is a decade old, and the overall status of gnuwin32 is unmaintained since 2013 as per Wikipedia. The modern approach is to use git-for-win: git-scm.com/download/win, which bundles the latest versions of Unix utilities. After installation, you should have %ProgramFiles%\Git\usr\bin in PATH with file.exe in it. For Windows 10 you may also enable Windows Subsystem for Linux (WSL), install a distro of your choice (ubuntu, fedora, alpine, gentoo, etc.), enter it and do file /mnt/c/your/path/in/windows/filename.extension (the /c/ part represents the C: drive).
–vulcan raven
Commented May 25, 2019 at 10:57
16
I used to work for the French National Library, building a digital archive system that contains not only digitized books but also millions of digital artefacts with all kinds of strange file types. We used JHOVE to recognize file formats.
JHOVE is open source and is maintained by JSTOR and the Harvard University Library. It is rather simple to use.
edited Jul 20, 2020 at 0:21
answered Apr 24, 2011 at 13:16
Nicolas Raoul
3
-
cool! but does it recognize proprietary formats like TrID does? anyways, I do have some uses to identify subformats/variants of non-proprietary formats (or, to be precise, proprietary 'extensions' to standardized formats), so this would come in handy. thank you for the heads-up!
–pepoluan
Commented Apr 24, 2011 at 14:00
-
It doesn't have support for most formats. IMHO a pretty useless tool unless you want to identify one of its few known file types.
–crash
Commented Sep 19, 2023 at 12:52
-
I know this answer is 13 years old by now, but even 13 years later this software still only supports identifying 17 file formats, whereas the tool TrID listed above currently identifies 18,234 file types. It's worth noting that the two tools serve different functions. TrID examines the "magic number" of the file (the first few bytes, which identify the file type). JHOVE examines the formatting, so where TrID sees an ASCII/text file, JHOVE might see that it is an .xml or .html file.
–ShaneB
Commented Oct 3 at 17:04
12
A modern approach that may appeal is to use Git for Windows. Run git-bash.exe and run the command file path\to\file. An example output might be:
TestFile.ico: MS Windows icon resource - 1 icon, 128x128, 32 bits/pixel
Alternatively, use the command file -i path\to\file, which might give:
TestFile.ico: image/vnd.microsoft.icon; charset=binary
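Building on the MIME output above, the following Python sketch shows one possible way to turn it into a rename plan. It assumes that file is on PATH (for example inside Git Bash) and that it supports the --brief and --mime-type options. The MIME_TO_EXTENSIONS table and all function names are hypothetical, and the table covers only a handful of types. The sketch prints suggestions and does not rename anything.

```python
import subprocess
from pathlib import Path

# Hypothetical, deliberately small table: MIME type -> accepted extensions,
# with the preferred extension listed first.
MIME_TO_EXTENSIONS = {
    "image/jpeg": (".jpg", ".jpeg"),
    "image/tiff": (".tif", ".tiff"),
    "image/gif": (".gif",),
    "application/pdf": (".pdf",),
    "image/vnd.microsoft.icon": (".ico",),
}


def detect_mime(path):
    """Ask the file command for the MIME type of one file."""
    result = subprocess.run(
        ["file", "--brief", "--mime-type", str(path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def plan_rename(path, mime_type):
    """Return the corrected path, or None if the name is fine or the type is unknown."""
    path = Path(path)
    accepted = MIME_TO_EXTENSIONS.get(mime_type)
    if accepted is None or path.suffix.lower() in accepted:
        return None
    return path.with_suffix(accepted[0])


def print_rename_plan(paths):
    """Print 'old -> new' for every file whose extension looks wrong."""
    for name in paths:
        target = plan_rename(name, detect_mime(name))
        if target is not None:
            print(f"{name} -> {target}")
```

For the example in the question, plan_rename("image.jpg", "image/tiff") suggests image.tif. Keeping detection and planning in separate functions makes it easy to review the printed plan before renaming anything.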
answered Jul 19, 2020 at 20:31
AlainD
1
-
1
Thank you! At least I can get the MIME types and construct an mv batch to fix the file extensions. Perhaps someone with more time can automate the process into a program :)
–ADTC
Commented Mar 23, 2021 at 17:22
3
You can check the file type from any computer, including Windows, at
answered Jun 4, 2018 at 11:24
John Williams
1
-
3
Welcome to Super User! Please read how to recommend software in answers, particularly the bits in bold; then edit your answer to follow the guidelines there. This applies even though you are recommending a website! Cheers
–bertieb
Commented Jun 4, 2018 at 11:33
2
I use Oracle's OutsideIn libraries in my programs. Not free, but they work well, especially for images. The market-speak says it supports over 500 file types.
edited Sep 20, 2020 at 4:24
answered Apr 24, 2011 at 11:30
Richard Brightwell