I just thought I'd summarize an issue I've been having with Spotlight since upgrading to 10.6, in the hope that it saves someone a little frustration and time. My symptoms: since upgrading, Spotlight indexing is VERY slow, and disks with Spotlight enabled were rapidly filling up.
The problem turns out to be a change in how mdimport detects file types. In earlier versions of OS X, the file type was determined largely by the extension or file attribute. So a simple text file with an unknown/missing extension and no attribute was not content-indexed by Spotlight. You could search by name, but not by content. In the Terminal on 10.5.x, you see:
-----
> echo '0.2110 1.0652 0.1857 1.0848 -1.2509 -0.5182 0.6170 -1.4375 1.4643 -1.9356' > foobar.ccc
> mdimport -pn -d 2 foobar.ccc
2009-10-29 12:09:40.570 mdimport[26246:10b] Imported '/Users/user/foobar.ccc' of type 'dyn.ah62d4rv4ge80g25d' with no plugIn.
2009-10-29 12:09:40.572 mdimport[26246:10b] Attributes: {
"_kMDItemFinderLabel" = <null>;
"com_apple_metadata_modtime" = 278525373;
kMDItemContentCreationDate = 2009-10-29 12:09:33 -0400;
kMDItemContentModificationDate = 2009-10-29 12:09:33 -0400;
kMDItemContentType = "dyn.ah62d4rv4ge80g25d";
kMDItemContentTypeTree = (
"public.data",
"public.item"
);
kMDItemDisplayName = {
"" = "foobar.ccc";
};
kMDItemKind = {
"" = Document;
da = Dokument;
de = Dokument;
es = Documento;
fi = Dokumentti;
it = Documento;
ja = "\U66f8\U985e";
ko = "\Ub3c4\Ud050\Uba58\Ud2b8";
nb = Dokument;
pl = dokument;
pt = Documento;
"pt-PT" = Documento;
ru = "\U0414\U043e\U043a\U0443\U043c\U0435\U043d\U0442";
sv = Dokument;
"zh-Hans" = "\U6587\U7a3f";
"zh-Hant" = "\U6587\U4ef6";
};
}
Total processing time spent in importer plug-ins: 0.049228 seconds for 1 files
Top 1 most expensive files for importer plug-ins:
1 /Users/user/foobar.ccc: 0.049228 seconds
-----
Doing the same in Snow Leopard (10.6.1) results in:
-----
> echo '0.2110 1.0652 0.1857 1.0848 -1.2509 -0.5182 0.6170 -1.4375 1.4643 -1.9356' > foobar.ccc
> mdimport -pn -d 2 foobar.ccc
(Info) Import: magic_file returned "ASCII text" for path "/Users/user/foobar.ccc" of type "dyn.ah62d4rv4ge80g25d"
(Info) Import: magic identified contents of file at path "/Users/user/foobar.ccc" as text; using text importer
(Info) Import: Import '/Users/user/foobar.ccc' type 'dyn.ah62d4rv4ge80g25d' using '/System/Library/Spotlight/RichText.mdimporter'
2009-10-29 12:11:24.280 mdimport[839:903] Imported '/Users/user/foobar.ccc' of type 'dyn.ah62d4rv4ge80g25d' with no plugIn.
2009-10-29 12:11:24.284 mdimport[839:903] Attributes: {
":MD:kMDItemSeedLastUsedDate" = 1;
"_kMDItemFinderLabel" = "<null>";
"com_apple_metadata_modtime" = 278525476;
kMDItemAuthors = "<null>";
kMDItemComment = "<null>";
kMDItemContentCreationDate = "2009-10-29 12:11:16 -0400";
kMDItemContentModificationDate = "2009-10-29 12:11:16 -0400";
kMDItemContentType = "dyn.ah62d4rv4ge80g25d";
kMDItemContentTypeTree = (
"public.data",
"public.item"
);
kMDItemCopyright = "<null>";
kMDItemCreator = "<null>";
kMDItemDisplayName = {
"" = "foobar.ccc";
};
kMDItemEditors = "<null>";
kMDItemKeywords = "<null>";
kMDItemKind = {
"" = Document;
de = Dokument;
en = Document;
fr = Document;
};
kMDItemOrganizations = "<null>";
kMDItemSubject = "<null>";
kMDItemTextContent = "0.2110 1.0652 0.1857 1.0848 -1.2509 -0.5182 0.6170 -1.4375 1.4643 -1.9356\n";
kMDItemTitle = "<null>";
}
Total processing time spent in importer plug-ins: 0.210648 seconds for 1 files
Top 1 most expensive files for importer plug-ins:
1 /Users/user/foobar.ccc: 0.210648 seconds
-----
Note that in 10.6.x mdimport 'magically' determines that the file is ASCII text and uses RichText.mdimporter to parse it (see the kMDItemTextContent field, which is absent from the 10.5.x output). Also note that in 10.5.x mdimport took about 0.05 seconds to process the file, while in 10.6.x it takes about 0.2 seconds, roughly four times as long.
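To make the difference concrete, here is a rough Python sketch of the two decision rules. It is only an illustration of the idea, not how mdimport or its magic_file call actually work: the extension list and all function names are made up for the example, and the "sniff" is a deliberately crude printable-ASCII check.

```python
import os

# Hypothetical set of extensions a text importer would claim (illustration only).
KNOWN_TEXT_EXTENSIONS = {".txt", ".rtf", ".html"}

def indexed_by_extension(path):
    """10.5-style rule: index content only if the extension is recognized."""
    return os.path.splitext(path)[1].lower() in KNOWN_TEXT_EXTENSIONS

def looks_like_ascii_text(data):
    """Crude content sniff: non-empty, and every byte is printable ASCII,
    tab, newline, or carriage return. Real magic detection is more involved."""
    if not data:
        return False
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)

def indexed_by_content(path, data):
    """10.6-style rule: if the extension is unknown, fall back to sniffing
    the file's bytes and index anything that looks like text."""
    return indexed_by_extension(path) or looks_like_ascii_text(data)

sample = b"0.2110 1.0652 0.1857 1.0848 -1.2509\n"
print(indexed_by_extension("foobar.ccc"))        # False
print(indexed_by_content("foobar.ccc", sample))  # True
```

Under the first rule foobar.ccc is skipped because of its unknown extension; under the second it gets picked up purely because its bytes look like text.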
Now this might not seem like a big deal, and it will certainly help you search for text in any random ASCII files you have on your hard disk. However, if you have ASCII data files that contain numeric values instead of words, it is a bit of a problem. In my case, Spotlight is essentially trying to index a large, unbounded set of noisy floating-point numbers and failing miserably (as one might expect). The result of this small change is that indexing is now slow and my Spotlight index grows until a disk-full kernel warning kills mds/mdworker. On the upside, it sure is fun to see how many data files I have that contain 1.42802...
(This is also described in http://forums.macosxhints.com/showthread.php?t=106703)
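If you want to see how many of your own files fall into this category, something like the following Python sketch might help. It is my own rough heuristic, not anything Spotlight does: the function names, the 0.9 threshold, and the 64 KB sample size are arbitrary assumptions. It walks a directory and lists ASCII files whose whitespace-separated tokens are almost all numbers.

```python
import os
import re

# Matches plain integers and decimals with optional sign and exponent,
# e.g. "42", "-1.2509", ".5", "1e-3". Assumed token format, adjust as needed.
NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")

def numeric_fraction(text):
    """Fraction of whitespace-separated tokens that look like numbers."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if NUMBER.match(t)) / len(tokens)

def find_numeric_files(root, threshold=0.9, sample_bytes=65536):
    """Return paths under root whose first sample_bytes decode as ASCII
    and consist mostly of numbers (fraction >= threshold)."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    chunk = fh.read(sample_bytes)
                text = chunk.decode("ascii")
            except (OSError, UnicodeDecodeError):
                continue  # unreadable or non-ASCII file: skip it
            if numeric_fraction(text) >= threshold:
                hits.append(path)
    return hits
```

Calling find_numeric_files on a data directory gives a list of candidates. Directories that turn out to be full of such files could then be added to the Privacy list in the Spotlight preference pane so they are not indexed at all.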