Openfire (ARCHIVED)

Use paged results in LDAP queries

Details

  • Type: Improvement
  • Status: Closed
  • Priority: Minor
  • Resolution: Fixed
  • Affects Version/s: None
  • Fix Version/s: 3.6.0
  • Component/s: Core
  • Labels:
    None
  • Environment:

    LDAP systems with large numbers of users

  • Acceptance Test - Add?:
    No

Description

LDAP servers can page results from a query, allowing for extremely large queries to be fulfilled efficiently. Openfire should be modified to use this method instead of relying on a single "large page" to return results.

Activity

Daniel Henninger added a comment -

The first iteration of this is not turning out well. It's not clear whether that's because the tester's LDAP db is simply too large or because of a bug in the code. Testing is being done to narrow this down.

Aaron Axelsen added a comment -

Daniel: I'll be happy to help test this once you get something. We have a 30,000 or so user AD.

Andreas Sieferlinger added a comment -

It would be great if this were implemented soon. If you have a 20,000-user AD and only get 1000 results, you can raise the limit on your DC, but the recommendation from MS is to have a maximum of 1000, so this is not the best way.

Daniel Henninger added a comment -

You'll be pleased to know that I've implemented this. The caveat is that I've got a person testing it (and Aaron, I might take you up on that soon if you have a test environment; I'll ping you over IM) and something went into never-never land when I did it. =/ Sigh. I also just realized I might have done something dippy. But at least it's useful for testing. Right now I'm having the administrator set a page size, but I'm not sure there's a point in that. It might be better to just pull everything I can and then use the paged results to get more. Either way, it's in progress, but right now it does not appear to be working well. It's not yet clear whether it's just that upset about so many users or not. I kind of doubt it. It should be able to handle plenty!

Andreas Sieferlinger added a comment -

If you want, I could also test it. Do you also page the groups?

Daniel Henninger added a comment -

Yup! We page groups as well. =) I do wonder ... what is the max number of members of a specific group? If you have 20,000 users in a single group, that's a loooooot of users. =) (and one big single response entry)

Daniel Henninger added a comment -

This came up in chatting earlier —

What are you all looking to see out of paged results?

What are you not getting right now?

The user summary page does not benefit from paged results much, unfortunately. LDAP paged results are not something where you can say "give me result 50 through 100". Instead, it's just an extension of how many results you can get. For example, if your server limits to 1000 results, you are given a token that you can use to get the next 1000 results. You can't create said token and say "start from 2000". It's something the server shares with you and is tracking itself.
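
The cookie exchange described above can be sketched with the paged results control that ships with JNDI (javax.naming.ldap.PagedResultsControl). This is only a minimal illustration under stated assumptions, not Openfire's actual implementation: the class name PagedSearchSketch, the helper names, and the parameters are all made up for this example, and the caller is assumed to supply an already connected LdapContext.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import javax.naming.ldap.Control;
import javax.naming.ldap.LdapContext;
import javax.naming.ldap.PagedResultsControl;
import javax.naming.ldap.PagedResultsResponseControl;

// Hypothetical helper class, for illustration only.
final class PagedSearchSketch {

    /** Returns the server's continuation cookie, or null when no further page exists. */
    static byte[] nextCookie(Control[] responseControls) {
        if (responseControls == null) {
            return null;
        }
        for (Control control : responseControls) {
            if (control instanceof PagedResultsResponseControl) {
                byte[] cookie = ((PagedResultsResponseControl) control).getCookie();
                return (cookie == null || cookie.length == 0) ? null : cookie;
            }
        }
        return null;
    }

    /** Collects the DNs of all entries matching the filter, one page at a time. */
    static List<String> searchAllPages(LdapContext ctx, String baseDn, String filter, int pageSize)
            throws NamingException, IOException {
        SearchControls scope = new SearchControls();
        scope.setSearchScope(SearchControls.SUBTREE_SCOPE);
        List<String> names = new ArrayList<>();
        byte[] cookie = null;
        do {
            // The first request carries no cookie; later requests echo back the
            // opaque cookie the server handed out with the previous page.
            Control paging = (cookie == null)
                    ? new PagedResultsControl(pageSize, Control.CRITICAL)
                    : new PagedResultsControl(pageSize, cookie, Control.CRITICAL);
            ctx.setRequestControls(new Control[] {paging});
            NamingEnumeration<SearchResult> page = ctx.search(baseDn, filter, scope);
            while (page.hasMore()) {
                names.add(page.next().getNameInNamespace());
            }
            cookie = nextCookie(ctx.getResponseControls());
        } while (cookie != null);
        return names;
    }
}
```

Note that the loop can only walk forward: the client never builds a cookie itself, which is why there is no way to jump straight to "result 2000".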

That said, the only thing this is going to solve is the case where you are running into problems because you have more of something than Openfire can grab in one query, and the missing entries actually matter.

I say that because the number of users returned doesn't matter much. When a user logs in, their profile is looked up directly, so you aren't doing a full search. In theory the list of all users isn't that important. (I haven't analyzed all of the code yet, so I might be missing some places where it is important.) I do know that I had more users (65K) than I could pull in a single result (500 cap, no paged results on that server), and folks could still log in. The one place I can think of where it would be a problem is groups, where you need to see all of the groups ... and really I'm not sure what else right now.

Aaron suggested maybe an option so you can disable the user summary page. I think I asked for that at one point as well in the past. That way, if you know you have too many users to deal with the user summary page, you can disable its use and prevent accidental loads of the entire db.

Anyway, let me know. =)

Andreas Sieferlinger added a comment -

The most important thing, I think, is to manage the roster group shares.
You don't need to manage the users and their group memberships, since that's what the AD does, so an option to disable that time-consuming listing would be great.

Trevor Scroggins added a comment -

This was immediately flagged as an issue during an evaluation of Openfire at my firm. In lieu of a paged search control, this might be implemented using a combination of a sort control and a virtual list view control on directories that support them, e.g. Active Directory. While intended for populating traditional list-style GUI elements, it should be well suited to "paged" results displayed in an HTML table. A stateless implementation, however, would lack the performance of a stateful implementation using LDAP server-side cookies to repeat searches for the same virtual list. Even so, it would be no worse than the search functionality in place today and should provide a complete result set for directories that cap their nonpaged query results at a limited number of entries.

Daniel Henninger added a comment -

Interesting, I'd love to hear some more details on this. Do you happen to have any links you could share? Active Directory supports the paged search control of course, and so we should be able to get full results as of the last couple of updates I did. However, loading the entire list is, how shall we say, slow. I've never heard of this virtual list view control you are referring to.

Trevor Scroggins added a comment -

VirtualListViewControl and VirtualListViewResponseControl classes implemented by Sun are bundled with the JNDI/LDAP Booster Pack, available from http://java.sun.com/products/jndi/downloads/index.html. The control itself is described in http://www.ietf.org/proceedings/02nov/I-D/draft-ietf-ldapext-ldapv3-vlv-09.txt.

The sequence of events might look like this:

1. Query directory for supported controls (from the JNDI tutorial):

import javax.naming.directory.*;  // DirContext, InitialDirContext, Attributes
DirContext ctx = new InitialDirContext();
Attributes attrs = ctx.getAttributes("ldaps://hostname:636", new String[] {"supportedControl"});

supportedControl should contain the following controls for virtual list view support:

1.2.840.113556.1.4.473 (server sort)
2.16.840.1.113730.3.4.9 (virtual list view)

(Incidentally, step one could be used for all directories to determine whether or not server-side sort is available, in which case the Openfire option to enable client-side sorting becomes an override of default behavior and not a "fix" for directories that lack the control. If the control isn't supported, you can simply enable client-side sorting or not sort at all, depending on the override setting.)

2. Initialize sort and virtual list view controls.
3. Initiate search.
4. Process results.
5. Repeat.
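
Step 1 above can be sketched as a small capability check. This is a hedged illustration only: the class ControlSupportSketch and its method names are hypothetical, the context passed in is assumed to be connected to the server so that the empty name resolves to the root DSE, and steps 2 through 5 are left out because the virtual list view control classes are not part of the standard JDK.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;

// Hypothetical helper class, for illustration only.
final class ControlSupportSketch {
    static final String SERVER_SORT_OID = "1.2.840.113556.1.4.473";
    static final String VIRTUAL_LIST_VIEW_OID = "2.16.840.1.113730.3.4.9";

    /** Reads the supportedControl values advertised by the connected server. */
    static Set<String> readSupportedControls(DirContext ctx) throws NamingException {
        Set<String> oids = new HashSet<>();
        // Assumption: ctx points at the server itself, so "" names the root DSE.
        Attributes attrs = ctx.getAttributes("", new String[] {"supportedControl"});
        Attribute supported = attrs.get("supportedControl");
        if (supported != null) {
            NamingEnumeration<?> values = supported.getAll();
            while (values.hasMore()) {
                oids.add(String.valueOf(values.next()));
            }
        }
        return oids;
    }

    /** Server-side sorting is available when the sort control is advertised. */
    static boolean canSortOnServer(Collection<String> oids) {
        return oids.contains(SERVER_SORT_OID);
    }

    /** Virtual list view needs both the sort control and the VLV control. */
    static boolean canUseVirtualListView(Collection<String> oids) {
        return oids.contains(SERVER_SORT_OID) && oids.contains(VIRTUAL_LIST_VIEW_OID);
    }
}
```

With a check like this, client-side sorting could be switched on only when canSortOnServer returns false, which matches the "override, not fix" idea in the parenthetical above.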

Details on how the control itself works are documented in the IETF draft. Googling for VirtualListViewControl should produce some useful samples. Some tuning of JVM memory settings may be necessary for large result sets (tens of thousands of records), depending on how the results are parsed and integrated into the Openfire administrator. An acceptable cost/performance ratio should be easily obtained by storing the search cookie, querying the directory as needed, and letting the browser cache keep track of previous results.

Trevor Scroggins added a comment -

Regarding group members, you can enumerate a large number of members (or a large number of values for any multi-valued attribute) using range retrieval. It's very simple, and you may already be using it. In a search, append ";range=<low>-<high>" to the attribute name to retrieve a subset of values for that attribute, e.g. "member;range=1-50". You can use '0' and '*' for low and high to denote the beginning or end of the list of values, respectively.
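
As a rough sketch of how a client might step through such ranges: the helpers below build the ranged attribute name and work out the next request from the attribute ID the server sends back. The class RangeRetrievalSketch and its methods are hypothetical, and the sketch assumes Active Directory's behaviour of echoing the served range in the returned attribute ID (for example "member;range=0-49"), with "*" as the upper bound on the final chunk.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class RangeRetrievalSketch {
    // Matches attribute IDs such as "member;range=0-49" or "member;range=50-*".
    private static final Pattern RANGE =
            Pattern.compile("^([^;]+);range=(\\d+)-(\\d+|\\*)$", Pattern.CASE_INSENSITIVE);

    /** Builds the attribute name to request, e.g. "member;range=0-49". */
    static String rangedName(String attribute, int low, int high) {
        return attribute + ";range=" + low + "-" + high;
    }

    /** Builds a request for every value from low to the end of the list. */
    static String rangedNameToEnd(String attribute, int low) {
        return attribute + ";range=" + low + "-*";
    }

    /**
     * Given the attribute ID returned by the server, yields the name to
     * request next, or null when the ID is not ranged or the server marked
     * the final chunk with "*".
     */
    static String nextRequest(String returnedId, int chunkSize) {
        Matcher m = RANGE.matcher(returnedId);
        if (!m.matches() || "*".equals(m.group(3))) {
            return null;
        }
        int nextLow = Integer.parseInt(m.group(3)) + 1;
        return rangedName(m.group(1), nextLow, nextLow + chunkSize - 1);
    }
}
```

A caller would request rangedName("member", 0, 49) first, then keep passing each returned attribute ID to nextRequest until it yields null.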

The results won't be sorted, which is a drawback. A poster on one of Sun's forums recommended sorting a search for (&(objectCategory=user)(memberOf=<group dn>)) as a workaround, but that also has drawbacks.

I don't know whether or not range retrieval is supported on directories other than Active Directory.

Trevor Scroggins added a comment -

One last comment: the virtual list view control is dynamic, so you can change the size and location of the results "window" during a search. (That maps nicely to a dynamically scrollable and resizeable list view GUI element, of course.)

Andreas Sieferlinger added a comment -

A short example:
If I search in groups for 'csm', I get an empty result page, but 17 pages are shown at the top, so I think this is also an issue related to the paged results, because we have about 40,000 groups.

Andreas Sieferlinger added a comment -

I also experienced the problem that if I enable contact list sharing, the server takes about 30 seconds and only adds 4 of 16 users to the shared roster.
I first thought that these 4 users were simply the ones within the first 1000 results, but they're not.
That's really a problem; doing the sharing for all 16 users by hand is some work.

Daniel Henninger added a comment -

That actually shouldn't be the case. The searches do "true LDAP searches", so the paged results should be irrelevant, assuming you aren't getting back a lot of pages of csm results. =) Anyone else running into problems with searching?

As for the contact list sharing, that's something I plan on looking into. It's kind of bizarre that it takes that long. My assumption is that it's doing a getUsers search even if it doesn't have to. It's hard to say without trying it some more though and watching what occurs.

Daniel Henninger added a comment -

Trevor, do you know of a control id for the ranged search results for attributes? In other words, is there a way I could ask the LDAP server "hey, do you support this?" and then use it, similar to the other features?

Trevor Scroggins added a comment -

According to Microsoft's documentation and the (expired) Internet draft "Incremental Retrieval of Multi-valued Properties" <http://www.ietf.org/proceedings/02mar/I-D/draft-kashi-incremental-00.txt>, supportedControl (incorrectly identified in the draft but correctly identified here) should contain 1.2.840.113556.1.4.802. As far as I know, Active Directory is the only directory that supports this option.

Not all clients correctly parse the results, as the option is not, strictly speaking, implemented in a standard way. The JNDI LDAP provider should work correctly, though.

