Content Query - across sites

A previous post looked at using search to aggregate data. Whilst that is an acceptable approach (though possibly with some limitations), another is to go down the route of providing a custom data source.


Todd Baginski and Andrew Connell have presented and posted information on how to do this: the Content Monster Web Part. Don't be put off by the name (those who have seen Todd present will appreciate the left-field naming!).

The article has been posted on MSDN, and the slide deck from TechEd is available too.


64 bit PDF iFilter

I'm a bit late posting this, but Adobe has now released the 64-bit PDF iFilter.

As a reminder, the iFilter is needed to enable the search engine to crawl and index content within PDF files.

Download the iFilter


Crawling Exchange 2003 Public Folders on SBS 2003 from MOSS 2007

Setting this up for the first time was a little challenging. I couldn't figure out why, no matter what combination of settings and credentials I gave the search service, it wouldn't crawl the public folders. I tried crawl rules, complex URLs and content access accounts, but eventually I gave up focusing on SharePoint options and started to look more closely at what was happening at the other end: what was Exchange doing?

I had already checked the IIS settings for the public virtual directory, which showed that Basic and Integrated Windows Authentication were both enabled. Next I tried to hit the public folders URL with it in my Local intranet zone, so that Windows would pass through my credentials automatically, just as we set up SharePoint all the time. I realised that no matter what I did with the IIS settings, I couldn't get to the page without first entering my credentials on a forms-based login for OWA. I googled some more and found that, although I was setting the authentication options in IIS, there are some additional settings in Exchange System Manager.

If you open ESM and expand Servers, <server name>, Protocols, HTTP, you'll find the Exchange virtual server. Right-click the virtual server and select Properties; on the second tab there is an innocuous little box that says "Enable Forms Based Authentication".

So it didn't matter what I did in IIS Manager, because it was overridden by the settings here. Someone helpfully pointed out in a forum that you can in fact create a second virtual server and set that to work without Exchange FBA. Yay! That's what I need: our existing users can keep their interaction the same on the current URLs, and we'll create a new virtual server set to use just Windows Integrated authentication rather than Exchange FBA, so hopefully our crawl will work fine.

So the crawl has reached the end and 18,224 items are indexed and searchable in milliseconds. Lovely. I just need to make sure I've put all the settings back how they should be (you did make a note of all those changes you made as you fiddled in IIS, Exchange and SharePoint, didn't you?), and once I'm happy that the URLs are accessible in the right places the job's done.
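A quick way to see which kind of login a URL is really asking for, before blaming the crawler, is to look at the unauthenticated response. The following is only a minimal Python sketch under my own assumptions: the function names `classify_auth` and `probe` are made up for illustration, and the rule of thumb (a 401 with a WWW-Authenticate header means HTTP authentication, a redirect suggests a forms login page) is a heuristic, not something Exchange guarantees.

```python
import http.client
from urllib.parse import urlsplit


def classify_auth(status, headers):
    """Guess how a URL is protected from an unauthenticated response.

    headers is a list of (name, value) pairs, as returned by
    http.client.HTTPResponse.getheaders().
    """
    # Header names are case-insensitive, so compare them in lower case.
    challenges = [value.lower() for name, value in headers
                  if name.lower() == "www-authenticate"]
    location = next((value for name, value in headers
                     if name.lower() == "location"), None)
    if status == 401:
        # The first word of each challenge is the scheme (negotiate, ntlm, basic).
        schemes = sorted({c.split()[0] for c in challenges if c.strip()})
        if schemes:
            return "http-auth: " + ", ".join(schemes)
        return "http-auth: unknown scheme"
    if 300 <= status < 400 and location:
        return "redirect to " + location + " (possibly a forms login)"
    if status == 200:
        return "no challenge (anonymous access or a login form in the page)"
    return "unclassified status " + str(status)


def probe(url):
    """Send one anonymous GET (redirects are not followed) and classify it."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("GET", parts.path or "/")
        response = conn.getresponse()
        return classify_auth(response.status, response.getheaders())
    finally:
        conn.close()
```

Calling `probe` against the existing public folders URL and then against the new virtual server should show the difference: the crawler needs the second one to answer with an HTTP authentication challenge rather than a redirect to a login form.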


Caught out by the loopbacks again!!!!

Building a server today, I got caught by the loopback problem.

The search crawl doesn't index any content, and the crawl log gives a misleading error:

<message>

'Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository. If the repository being crawled is a SharePoint repository, verify that the account you are using has "Full Read" permissions on the SharePoint Web Application being crawled. (The item was deleted because it was either not found or the crawler was denied access to it.)'

</message>

The support article by Microsoft will resolve this

This was annoying because the same thing caught me last time - and again I went looking at policies, permissions....argh!


MOSS Search not crawling beyond the first page of a site...

Working on a new intranet site for a client, I was puzzled when only the homepage of the root site collection would appear in search results. We have site collections on managed paths below the root site and those weren't showing either. Crawls were running, but the crawl log showed "Some parts of this document cannot be accessed". I checked the SharePoint logs and the Windows event log, then googled away as usual, but there weren't really any further clues; everything seemed to be as it should be.

I tried resetting all content in the search index and this made no difference. Next up I tried a new content source, as I had noticed that some blogs and forums mentioned an error message saying:

The start address http://intranet/sites/sitename is not valid for this content source type.

So I tried creating a content source of type SharePoint Sites pointing at the site collections on managed paths, and these gave the same message. Ah, progress, I thought, as it gave another clue. But alas, no further clues were to be found, and no further hits on Google revealed an insight that would solve it. So I tried restarting the Search service and the Timer service, followed by a full crawl. Straight away the crawl log started to give me more than just one hit on the new location, and by the end of the crawl all the results I had been expecting first time round were there.

The only explanation I can think of is that we recreated the site collections several times during deployment, and that confused the search service.

I hope that little gem helps you out of a spot if search isn't bringing back the results you're expecting.
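If you want to script the restart step described above, something like the following Python sketch would do it. The service names `OSearch` and `SPTimerV3` are my assumption for the Office SharePoint Server Search and Windows SharePoint Services Timer services on a MOSS 2007 box, so check them in the Services console first; the helper names are made up for illustration. It defaults to a dry run that only prints the commands, and the full crawl still has to be started afterwards from the content source.

```python
import subprocess

# Assumed Windows service names for MOSS 2007; verify them on your own server.
SERVICES = ["OSearch", "SPTimerV3"]


def restart_commands(services=SERVICES):
    """Build a 'net stop' then 'net start' command for each service in turn."""
    commands = []
    for name in services:
        commands.append(["net", "stop", name])
        commands.append(["net", "start", name])
    return commands


def restart(services=SERVICES, dry_run=True):
    """Print the commands (dry run) or run them, stopping at the first failure."""
    for command in restart_commands(services):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if 'net' reports an error.
            subprocess.run(command, check=True)
```

Run `restart()` to see what would happen, and `restart(dry_run=False)` from an elevated prompt on the server to actually do it.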

 


MOSS Query

Found an amazing tool to build queries for MOSS.

http://www.codeplex.com/SharePointSearchServ/


Search Server Thoughts

I'm part way through the great book on Search Server 2008. Even using SharePoint and Search every day, it's still interesting to find the things you don't know (or have forgotten) about the products.

Here are some quick notes that I made while reading the book. They are in no particular order, other than that I wrote them as I read, so they probably follow the chapters.

If using a separate index server, make it also a Web Front End

The Search Service account will be granted read permissions in the farm and should not be part of the administrators

WSS Search (for searching Help) should be configured to crawl off peak. It only indexes the Help and runs a full crawl each time?

Look carefully at crawl impact rules for external sites and check the order of the rules

Schedule any full crawls to run after any daily backup jobs

Manage 'web site' content sources with custom properties and check the hop number (to avoid crawling the net) or set to 'this site only'

You can index forms-based authenticated sites using crawl rules and form credentials - point to the login page to crawl

Basic authentication is sent in clear text so implement SSL. You will need to specify the certificate?

You can specify a cookie

Ignore SSL warnings overcomes any self signed certificates

You can configure the Shared Services recently configured crawls web parts to suit your requirements

Search Scope can implement a custom property

Search result removal clears items from the index immediately and prevents explicit results being displayed

Search needs monitoring and management !!!!!!

The indexer will only process 16 MB of a file

.one and .vsd files are not crawled by default and need additional iFilters

Beware of choosing stop / start services from Central Admin - this will clear the index and require a new full crawl
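On the 16 MB note, it can be handy to know which documents are over that size before relying on search to find their content. Here is a minimal Python sketch, assuming the documents sit in a folder you can walk (a file share content source, say, or a folder waiting to be uploaded); the function name `files_over_limit` and the constant are my own, and 16 MB is simply the figure from the note above.

```python
import os

# 16 MB, the figure from the note above.
LIMIT_BYTES = 16 * 1024 * 1024


def files_over_limit(root, limit=LIMIT_BYTES):
    """Return (path, size) pairs for files under root larger than limit, biggest first."""
    oversized = []
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read while walking.
                continue
            if size > limit:
                oversized.append((path, size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Point it at the folder in question and print the pairs it returns; anything listed is a candidate for content that may not be fully searchable.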


MOSS search results pages are consistently inconsistent in returning slowly!

I've just completed a migration from SharePoint Portal Server 2003 to a shiny new MOSS 2007 implementation. Whilst looking for some content that had gone missing, I noticed that the search results page sometimes took 30 - 60 seconds to return the next page in the result set, whilst on other occasions it came back very quickly. My first thought was that perhaps it's busy, with all those users making lots of use of it! But no, processor usage on the boxes was all negligible.

As I explored this some more I discovered that it was consistently inconsistent! By which I mean that the first page of a particular query always came back quickly, the second slowly, the third and fourth quickly, then the fifth slowly, and so on. I think that's a quickstep (I've been subjected to too much Strictly Come Dancing). Anyhow, some digging around eventually showed up the following messages in the logs:

 12/08/2008 22:32:19.57  w3wp.exe (0x1134)                        0x0EC4 Search Server Common           MS Search Query                0 High     Exception while finishing web request or starting web page read on http://search.live.com/results.aspx?q=safety&count=3&first=1&mkt=en-GB&format=rss&FORM=SHAREF: System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host 194.217.240.73:80     at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)     at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception)     --- End of inner exception stack trace ---     at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)     at Microsoft.Office.Server.Search.... 
12/08/2008 22:32:19.57* w3wp.exe (0x1134)                        0x0EC4 Search Server Common           MS Search Query                0 High     ...Federation.HttpAsync.RespCallback(IAsyncResult asynchronousResult) 

and

12/08/2008 22:27:35.20  w3wp.exe (0x1134)                        0x0EC4 Search Server Common           MS Search Query                0 High     Exception while finishing web request or starting web page read on http://search.live.com/QSOnly.aspx?q=safety&count=3&first=1&mkt=en-GB&FORM=SHARES&format=rss: System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host 194.217.240.71:80     at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)     at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception)     --- End of inner exception stack trace ---     at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)     at Microsoft.Office.Server.Search.F... 
12/08/2008 22:27:35.20* w3wp.exe (0x1134)                        0x0EC4 Search Server Common           MS Search Query                0 High     ...ederation.HttpAsync.RespCallback(IAsyncResult asynchronousResult) 

I thought it a bit strange that a federated search was being performed, as there were no federated search Web Parts on the results page I was looking at. Anyhow, it seems that by default the federated searches at Microsoft's live.com are run anyway. Disabling these federated search locations in Search Administration > Federated Locations instantly gave me nice quick results all the time. Well, I say disabling: rather than removing them, I added a prefix that's unlikely to get used, as I thought they might come in useful when Microsoft has fixed the delay!
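If you suspect the same thing on your own farm, it helps to pull the failing federated hosts out of the logs rather than reading the stack traces by eye. This is a rough Python sketch that assumes your log lines have the same "web page read on <URL>: System.Net.WebException" wording as the entries quoted above; the function name `failing_hosts` and the pattern are mine.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the URL between "web page read on " and ": System.Net.WebException",
# as seen in the log entries quoted above.
PATTERN = re.compile(r"web page read on (\S+?): System\.Net\.WebException")


def failing_hosts(log_lines):
    """Count failed federated requests per host name."""
    hosts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            hosts[urlsplit(match.group(1)).netloc] += 1
    return hosts
```

Feed it the lines of a log file and any host with a non-zero count is a federated location worth disabling or investigating.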

If you find another solution - please add a comment to let me know. Thanks.


SharePoint Podcasts

I travel by train to the office regularly now. Although this sometimes takes longer than driving, it is more consistent and it gives me a couple of hours a day where I can do some reading without interruption (plus I feel I'm doing my bit for the planet! :) )

This will seem geeky, but I get fed up listening to CDs or the radio (though Classic FM is a great soother) and have recently started getting the SharePoint podcasts. Personally I find the podcasts easier than webcasts (perhaps because I don't have the distraction of the PC?). You can download the MP3s or find them via iTunes.

These are hosted by our friends at Lightning and have some great content for developers and administrators.


Displaying raw XML on search results

Quite often you will need to customise the search results and ensure that your custom metadata mappings are displaying correctly.

To view the raw results, I do the following:

a) Edit the results page (results or people, depending on what you are looking to modify)

b) Add a new Core Results Web Part (again people or results)

c) Add your columns, fixed queries etc

d) Click the Edit XSLT button and replace the XSLT with the following

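<!-- Replacement XSLT: copies the raw search result XML to the page unchanged -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:template match="/">
<!-- xmp makes the browser show the copied markup as text instead of rendering it -->
<xmp><xsl:copy-of select="*"/></xmp>
</xsl:template>
</xsl:stylesheet>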

Save everything and you can now see the raw results.
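Once you can see the raw XML, you can copy it off the page and check your custom columns with a few lines of script instead of eyeballing every result. The following Python sketch is illustrative only: the `All_Results` and `Result` element names and the `projectcode` column in the sample are assumptions of mine, so substitute whatever element names your own raw output shows.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of raw results; your element and column names may differ.
SAMPLE = """<All_Results>
  <Result><id>1</id><title>Safety policy</title><projectcode>P-17</projectcode></Result>
  <Result><id>2</id><title>Site induction</title><projectcode /></Result>
</All_Results>"""


def missing_column(raw_xml, column, result_tag="Result"):
    """Return the 1-based positions of results where column is absent or empty."""
    root = ET.fromstring(raw_xml)
    missing = []
    for position, result in enumerate(root.iter(result_tag), start=1):
        # findtext gives None when the element is absent and "" when it is empty.
        value = result.findtext(column)
        if value is None or not value.strip():
            missing.append(position)
    return missing
```

With the sample above, `missing_column(SAMPLE, "projectcode")` flags the second result, which is the sort of gap that usually points at a metadata mapping that hasn't been crawled yet.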


 
