How do I search/navigate a list of objects created by a PowerShell command, like Get-Process?

Is there a way to search a sorted list without piping it to where-object once for each item in my list of keys?  

I'm thinking in terms of:

  1. Take an item from list1.
  2. Use list1.property4 to search for a match in list2.property10.
  3. Repeat.

So that a pointer, or whatever, progresses through the list without having to search the entire list2 for each item from list1.  
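The pointer idea above is essentially a merge join. A minimal sketch, assuming both lists can be sorted on the shared key and that each key appears at most once per list; the function names, the `Upn` property, and the sample data are made up for illustration:

```powershell
# Sort a list on a string key with one explicit comparer, so the sort order
# and the merge comparison below are guaranteed to agree.
function Get-SortedByKey {
    param([object[]]$Items, [string]$Key)
    $keys   = @($Items | ForEach-Object { [string]$_.$Key })
    $sorted = $Items.Clone()
    [Array]::Sort($keys, $sorted, [System.StringComparer]::OrdinalIgnoreCase)
    return ,$sorted
}

# Merge join: advance whichever side holds the smaller key.
# Each list is read exactly once. Assumes keys are unique within each list.
function Join-SortedList {
    param([object[]]$Left, [object[]]$Right, [string]$LeftKey, [string]$RightKey)
    $comparer = [System.StringComparer]::OrdinalIgnoreCase
    $i = 0; $j = 0
    while ($i -lt $Left.Count -and $j -lt $Right.Count) {
        $c = $comparer.Compare([string]$Left[$i].$LeftKey, [string]$Right[$j].$RightKey)
        if ($c -lt 0) { $i++ }        # left key has no partner; move on
        elseif ($c -gt 0) { $j++ }    # right key has no partner; move on
        else {
            New-Object PSObject -Property @{ Left = $Left[$i]; Right = $Right[$j] }
            $i++; $j++
        }
    }
}

# Hypothetical sample data standing in for list1 and list2.
$list1 = @(
    New-Object PSObject -Property @{ Upn = 'carol@example.com'; Name = 'Carol' }
    New-Object PSObject -Property @{ Upn = 'alice@example.com'; Name = 'Alice' }
    New-Object PSObject -Property @{ Upn = 'bob@example.com';   Name = 'Bob' }
)
$list2 = @(
    New-Object PSObject -Property @{ Upn = 'dave@example.com';  Size = 40 }
    New-Object PSObject -Property @{ Upn = 'alice@example.com'; Size = 10 }
    New-Object PSObject -Property @{ Upn = 'carol@example.com'; Size = 30 }
)

$sorted1 = Get-SortedByKey $list1 'Upn'
$sorted2 = Get-SortedByKey $list2 'Upn'
$joined  = @(Join-SortedList $sorted1 $sorted2 'Upn' 'Upn')
```

Sorting costs more than the walk itself, so this only pays off when the lists arrive sorted or are joined more than once; for a one-off join a hashtable index is usually simpler.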

I'm working with Exchange objects in PowerShell v2, specifically mailboxes, AD users, and mailbox statistics.

Many thanks in advance!  

April 27th, 2015 5:42pm

It would be helpful if you could give a specific example of a problem you are trying to solve, what you tried, what didn't work, and specifically how it didn't work.
April 27th, 2015 6:03pm

Thank you for asking!  I'm very happy to elaborate!  

Most of the time, so far, I've used foreach to do the following:

  1. $mymbxs = Get-Mailbox for an organization.
  2. Foreach ($mymbx in $mymbxs)...
  3. Get-ADUser $mymbx.
  4. Get-MailboxStatistics $mymbx.
  5. Create a new record in a new table with the required info.
  6. Next $mymbx.

I want to discover whether I can reduce the execution time by working my way through the lists in parallel. I don't want to use Where-Object, as that would require reading the AD user list and the stats list once each for every mailbox.

  1. $mymbxs = Get-Mailbox for an organization.
  2. $myusers = Get-ADUser for the same organization.
  3. $mystats = Get-MailboxStatistics for the organization.
  4. Foreach ($mymbx in $mymbxs)...
  5. Find $mymbx's AD user info in $myusers.
  6. Find $mymbx's mailbox statistics in $mystats.
  7. Create a new record in a new table with the required info.
  8. Next $mymbx.

I want to find out if I can accelerate the process by collecting the info up-front and traversing the array, to eliminate the overhead of making repeated gets for each mailbox. 
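One way to get the "collect up front, then look up" behaviour without Where-Object is to index each lookup list in a hashtable once. This is a minimal sketch on stand-in data rather than real Exchange output; the key properties (`UserPrincipalName`, and `ExchangeGuid` matched against `MailboxGuid`) are assumptions, so substitute whatever property your three object types actually share:

```powershell
# Stand-in data. In the real script these would be the three bulk results;
# the property names used as keys here are assumptions for illustration.
$mymbxs = @(
    New-Object PSObject -Property @{ UserPrincipalName = 'alice@example.com'; ExchangeGuid = 'g1' }
    New-Object PSObject -Property @{ UserPrincipalName = 'bob@example.com';   ExchangeGuid = 'g2' }
)
$myusers = @(
    New-Object PSObject -Property @{ UserPrincipalName = 'bob@example.com';   Department = 'Sales' }
    New-Object PSObject -Property @{ UserPrincipalName = 'alice@example.com'; Department = 'IT' }
)
$mystats = @(
    New-Object PSObject -Property @{ MailboxGuid = 'g2'; ItemCount = 7 }
    New-Object PSObject -Property @{ MailboxGuid = 'g1'; ItemCount = 42 }
)

# One pass over each lookup list to build an index: key -> object.
$userByUpn = @{}
foreach ($u in $myusers) { $userByUpn[$u.UserPrincipalName] = $u }
$statsByGuid = @{}
foreach ($s in $mystats) { $statsByGuid[$s.MailboxGuid] = $s }

# One pass over the mailboxes; each lookup is a hash probe, not a scan.
$report = @(foreach ($mbx in $mymbxs) {
    $user  = $userByUpn[$mbx.UserPrincipalName]
    $stats = $statsByGuid[$mbx.ExchangeGuid]
    New-Object PSObject -Property @{
        Upn        = $mbx.UserPrincipalName
        Department = $user.Department
        ItemCount  = $stats.ItemCount
    }
})
```

A key with no match returns `$null` from the hashtable, so the corresponding fields simply come out empty; a production script would probably want to log those cases.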

I'm also looking at possibly using queues to allow several identical processes to collect data in parallel, but haven't worked out the design yet.

Thanks for the request, and any assistance.  



  • Edited by daddmac 7 hours 45 minutes ago
April 27th, 2015 7:24pm

You can use the pipeline to streamline this. It will use fewer resources and be more efficient; however, you may have concurrency issues.

The actual answer depends on what your final output looks like.

April 27th, 2015 7:48pm

Thank you!

Using the pipeline, at least as I understand it, I would have to read the aduser info once for each mailbox, and I would have to read the mailbox statistics once for each mailbox.  I have about 50,000 mailboxes.  So, I'll collect 150,000 records, and read 100,000 of them 50,000 times.  
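To put rough numbers on that concern, here is a back-of-the-envelope count. It assumes one AD user record and one statistics record per mailbox (which is what 150,000 records for 50,000 mailboxes implies) and that Where-Object tests every element of the list it is handed; the variable names are illustrative only:

```powershell
$mailboxes = 50000
$adUsers   = 50000
$stats     = 50000

# Scanning both lookup lists in full, once per mailbox.
$scanComparisons = [long]$mailboxes * ($adUsers + $stats)    # 5000000000

# Indexing both lookup lists once, then two lookups per mailbox.
$indexedOperations = ($adUsers + $stats) + 2 * $mailboxes    # 200000

'Full scans:     {0} comparisons' -f $scanComparisons
'Indexed lookup: {0} operations' -f $indexedOperations
```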

My output is a record containing information from all three lists.  

I'm also open to ways of doing this other than the one I've described.

What do you think?  :-)


  • Edited by daddmac 6 hours 19 minutes ago
April 27th, 2015 8:51pm

Clearly you fail, like so many, to understand the pipeline.

Read one user and get that user's mailbox, then get that mailbox's statistics.

What you propose is exponential. The pipeline is one-to-one.

Why would PowerShell use a pipeline if it worked like you assume it does? That makes no technical sense. Go back and review how the technology is designed to work. You will be surprised.
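For readers following along, the one-to-one shape described here might look like the sketch below. `Get-DemoMailbox` and `Get-DemoStatistics` are hypothetical stand-ins so the example runs anywhere; they are not Exchange cmdlets, and the `ItemCount` values are invented:

```powershell
# Stand-in for a cmdlet that streams mailbox objects one at a time.
function Get-DemoMailbox {
    foreach ($name in 'alice', 'bob', 'carol') {
        New-Object PSObject -Property @{ Alias = $name }
    }
}

# Stand-in for a per-mailbox lookup: one call, one answer.
function Get-DemoStatistics {
    param([string]$Alias)
    New-Object PSObject -Property @{ Alias = $Alias; ItemCount = $Alias.Length }
}

# Each mailbox flows down the pipeline once and triggers exactly one lookup;
# no list is scanned and nothing is held in memory beyond the current object.
$report = @(Get-DemoMailbox | ForEach-Object {
    $stats = Get-DemoStatistics -Alias $_.Alias
    New-Object PSObject -Property @{ Alias = $_.Alias; ItemCount = $stats.ItemCount }
})
```

With real cmdlets each lookup is a separate call, so whether this beats collecting everything in bulk depends on how expensive an individual call is in your environment.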

April 27th, 2015 9:47pm

Harsh rebukes aside, JRV has a point that the pipeline is 1:1.

You can, however, do what you want in parallel, using a workflow with foreach -parallel:

(http://blogs.technet.com/b/heyscriptingguy/archive/2012/12/26/powershell-workflows-the-basics.aspx)

workflow Get-MailboxStuff {
    param([string[]]$UPNs)
    # Workflows and foreach -parallel require PowerShell 3.0 or later.
    foreach -parallel ($UPN in $UPNs) {
        sequence {
            # 1:1 pipeline command to get and
            # set what you want for one user
        }
    }
}

# Placeholder: replace @() with the command that gets the right users/mailbox users.
$users = @()

Get-MailboxStuff -UPNs $users

This is a non-functional example script to get you headed in the right direction, and of course it can be done in fewer lines (JRV and others, do correct my errors).

April 28th, 2015 5:38am

Restating my original question: to make the process more efficient, I want to avoid where (which would implicitly use a pipeline) and instead:
1) Make a single call for each dataset to store all the data in three variables, instead of making one call per mailbox.
$mbxs = Get-Mailbox -ResultSize Unlimited
$adinfo = Get-ADUser -Filter *
$stats = Get-MailboxDatabase | Get-MailboxStatistics
2) Traverse the lists of data one time, progressing as I process each key mailbox.

Thanks for clarifying the one-to-one point. The problem is arithmetic rather than exponential, but you got my point. The math looks like this:
mbxdata + (mbxdata*adinfodata) + (mbxdata*statsdata)

You said the pipeline would do what I want: "You can use the pipeline to streamline this." I'm asking how that would look in pseudo-code.

The following pseudo-code illustrates the problem of the reads, which you called out in your second comment.

# One pass through the key list.
foreach ($mailbox in $mbxs) {
    $aggregate = "" | select "mbxid","adinfoa","adinfob","statsinfoa","statsinfob"

    # Read aduser info: one full pass through $adinfo for every mailbox.
    $aduser = $adinfo | where { $_.upn -eq $mailbox.upn } | select upn...

    # Read mailbox statistics: one full pass through $stats for every mailbox.
    $statsinfo = $stats | where { $_.dn -eq $mailbox.dn } | select total...
}
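The cost of the repeated Where-Object passes in the pseudo-code above can be checked on synthetic data. The sketch below is illustrative only: the list size, the `upn` and `dept` properties, and the generated addresses are all made up, and the timings will vary by machine. It runs the same lookup both ways and times each:

```powershell
# Synthetic stand-ins for the mailbox list and the AD user list.
$n = 300
$mbxs = @(1..$n | ForEach-Object {
    New-Object PSObject -Property @{ upn = ('user{0}@example.com' -f $_) }
})
$adinfo = @(1..$n | ForEach-Object {
    New-Object PSObject -Property @{
        upn  = ('user{0}@example.com' -f $_)
        dept = ('dept{0}' -f ($_ % 5))
    }
})

# Approach 1: scan the whole AD list with Where-Object for every mailbox.
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$scanResult = @(foreach ($m in $mbxs) {
    $adinfo | Where-Object { $_.upn -eq $m.upn }
})
$scanMs = $timer.ElapsedMilliseconds

# Approach 2: index the AD list once, then do one hash lookup per mailbox.
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$byUpn = @{}
foreach ($a in $adinfo) { $byUpn[$a.upn] = $a }
$indexResult = @(foreach ($m in $mbxs) { $byUpn[$m.upn] })
$indexMs = $timer.ElapsedMilliseconds

'Where-Object scan: {0} ms; hashtable lookup: {1} ms' -f $scanMs, $indexMs
```

Both approaches return the same records; the scan time grows with the product of the two list sizes, while the indexed version grows with their sum.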

April 28th, 2015 8:52am

I appreciate your professionalism in taking the time to review my question and respond so well!  

That is exactly the kind of information I was hoping for, or for someone to tell me it can't be done!  

I'll review the link and see if I can work something out from the code you listed and post an update.  

Again, many thanks to you!

April 28th, 2015 9:00am

For the reasons stated above, your method is very inefficient and is not likely to work.

Do you have a specific question or are you just speculating on how you might do this?

April 28th, 2015 9:01am

You can look through these examples to see how the CmdLets are used to gather statistics. I think you will begin to see how to use these things more efficiently.

https://gallery.technet.microsoft.com/scriptcenter/site/search?query=mailbox%20statistics&f%5B0%5D.Value=mailbox%20statistics&f%5B0%5D.Type=SearchText&f%5B1%5D.Value=Exchange&f%5B1%5D.Type=RootCategory&f%5B1%5D.Text=Exchange&ac=4

April 28th, 2015 9:06am

Thank you for your comment.  

The question was:
Can I use PowerShell commands to look up multiple keys in a data set while traversing the data set only one time?

While it may not work in all situations, here is an example of how the bulk collection of data is much more efficient than the conventional method of collecting data piecemeal: 
Previously, we downloaded a well-known script that creates a status report of Exchange services. However, because of the size of the environment, the run time of the script as downloaded was about an hour on average. We wanted an hourly report, so it was too slow.

The script executes requests for data as it reads a list of objects it uses for keys. Because of the excellent quality of the original code (kudos to Paul Cunningham), I was able to modify the script to gather all the data about all the objects at the top of the execution, and to replace the subsequent "get-..." commands with queries against the variables where the data was stored. The run time was reduced to about 15 minutes. Now the script is 45 minutes faster, and useful for producing an hourly report.
April 28th, 2015 9:27am

Thank you JRV!  That will be very helpful!

I appreciate the link and will review it today.  
April 28th, 2015 9:37am

You keep posting about "datasets".  PowerShell does not have datasets.  They are objects and object collections.  They cannot be treated like datasets.

In a pipeline we are not collecting data piecemeal. The pipeline allows us to efficiently manipulate the information and convert it into a final form.

Without knowing what your final objects look like, it is impossible for anyone to understand what your issue is. It is likely that you are using the wrong CmdLets and objects, since the CmdLets are very redundant.

There are also many other considerations that you are missing.

April 28th, 2015 10:03am

You will need to understand PowerShell basics, particularly the pipeline, to process data efficiently in PowerShell.

Start with reading and understanding this help topic:

April 28th, 2015 10:22am

You are correct, in that I have been using "data set" as a generic descriptor of a collection, or "set", of objects.  I do this to avoid excluding anyone from the conversation who may have good ideas or constructive comments, but may not be familiar with the vernacular.  I was not using it in the sense of textual (string) listings of data.  

Also, my use of the word "piecemeal" was to indicate that information (in the form of objects) about a given object is retrieved ("get-...") as needed. This is in contrast to the preparatory bulk collection of all the information that will be needed for the entire operation.

Which commands should be used to retrieve ADUser or MailboxStatistics information instead of get-aduser or get-mailboxstatistics?

What considerations am I missing?   
April 28th, 2015 11:34am

Thank you Bill, I will do that!
April 28th, 2015 11:38am


"Which commands should be used to retrieve ADUser or MailboxStatistics information instead of get-aduser or get-mailboxstatistics?"

It all depends on what properties you want to report.

Using "dataset" is misleading you. A dataset is either a collection of related SQL tables or a set of measurements used in statistical analysis. The objects here are not a dataset.

Exchange is not a relational database, and the data you seek is not about tables or lists. It is about very loosely related objects.

If you are running remotely, you will have to first grab the mailboxes, then enumerate them and add selected object data. Doing this in the pipeline correctly provides the best performance and resource usage.
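One common way to "add selected object data" while enumerating is a calculated property on Select-Object. The sketch below is only an illustration of that idea: `Get-DemoItemCount` is a hypothetical stand-in for whatever per-mailbox call supplies the extra value, and the mailbox objects and their properties are invented:

```powershell
# Stand-in for a per-mailbox lookup that supplies one extra value.
function Get-DemoItemCount {
    param([string]$Alias)
    $Alias.Length * 10
}

# Stand-in mailbox objects.
$mailboxes = @(
    New-Object PSObject -Property @{ Alias = 'alice'; Database = 'DB01' }
    New-Object PSObject -Property @{ Alias = 'bob';   Database = 'DB02' }
)

# Keep the properties of interest and attach a computed one as each
# object passes through the pipeline.
$report = @($mailboxes | Select-Object Alias, Database,
    @{ Name = 'ItemCount'; Expression = { Get-DemoItemCount -Alias $_.Alias } })
```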

April 28th, 2015 11:45am

Koen, your example looks like it could work very well for what I need to accomplish, and for several other projects.

I'm working in v2, where workflows do not appear to be supported. When we get to v3+ I'll certainly be using it.

Again, thanks for being so helpful!

April 28th, 2015 1:30pm

Yes, it does require 3.0 as a minimum.

But until then you can use a foreach loop; it just takes longer.

Just start with 'help foreach' in powershell and work from there.

April 28th, 2015 3:49pm

Thank you all for your time to respond with very qualified suggestions.  

Koen's response is the closest to what I am looking for.  

I will have to delay implementing it as the servers are limited to v2 by the Exchange version.  

In the meantime, I'll pursue using queues and child workers as an alternative.  

April 29th, 2015 7:42am

Just as a note, Exchange 2010 can happily coexist with WMF4 as long as you're on SP3 with UR5.
April 29th, 2015 9:18am

Thanks Mike! I appreciate the note. I think it's a very big deal, as there are a number of efficiencies and other enhancements we would benefit from.

I was very enthusiastic, and we looked into that option when UR5 came out. We were concerned about changes that might affect how we run existing scripts, and about getting all our staff on board with the configuration. So we decided to wait for the OS and application upgrades before making the transition.

April 29th, 2015 3:06pm

While doing a little more research into Koen's information about foreach -parallel, I found the following article that seems to be worth testing out.  

FYI:
https://serverfault.com/questions/626711/how-do-i-run-my-powershell-scripts-in-parallel-without-using-jobs

Sharing the wealth.  
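For anyone stuck on v2 like me, one jobs-free way to run script blocks in parallel is a runspace pool. I have not confirmed that this is exactly the approach the linked article takes, so treat the following as a minimal sketch under that caveat. The worker script and the aliases are made up, and each runspace starts without the Exchange cmdlets loaded, so a real worker would have to load them itself:

```powershell
# A pool that runs at most three workers at a time; extra work items wait
# their turn, which gives queue-like behaviour without a hand-built queue.
$pool = [runspacefactory]::CreateRunspacePool(1, 3)
$pool.Open()

# Hypothetical worker: stands in for the per-mailbox data collection.
$work = {
    param($Alias)
    New-Object PSObject -Property @{ Alias = $Alias; Length = $Alias.Length }
}

# Start one asynchronous invocation per item and remember its handle.
$pending = @(foreach ($alias in 'alice', 'bob', 'carol', 'dave') {
    $ps = [powershell]::Create().AddScript($work).AddArgument($alias)
    $ps.RunspacePool = $pool
    New-Object PSObject -Property @{ Shell = $ps; Handle = $ps.BeginInvoke() }
})

# Collect the results in submission order, then clean up.
$results = @(foreach ($p in $pending) {
    $p.Shell.EndInvoke($p.Handle)
    $p.Shell.Dispose()
})
$pool.Close()
```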

April 30th, 2015 8:50am

Oh, it was Exchange 2010; I should have spotted that. Thanks, Mike!
April 30th, 2015 11:31am

This topic is archived. No further replies will be accepted.
