Picozed 7015 + Carrier Card + Gig SFP

Unsolved
manabit
Junior(0)
Picozed 7015 + Carrier Card + Gig SFP

Hello All,

I'm attempting to use the Xilinx 1G/2.5G PCS/PMA core to connect a Finisar SFP module in the carrier card's SFP slot to the PS GEM1 on a PicoZed 7015. My design is similar to that of XAPP1082 but adapted to the PicoZed.

I'm using Vivado 2015.2, and a new picozed 7015 and carrier card.

The carrier card has been configured to output a 125 MHz clock (mgtrefclk1). The independent clock for the core is provided by the PS oscillator (200 MHz). Both clocks have been verified externally with a scope.

I've been working on this for several weeks and am encountering a persistent issue where the transceiver GTP PLL0 fails to stay locked ONLY when the SFP is inserted in the slot.

I originally thought the core was sending a reset signal to the PLL, so I implemented the PLL in the example design, removed the reset connection between the PCS/PMA and the PLL, and tied the reset to GPIO so that I could reset the PLL manually. In this configuration, I can reset the PLL and it is stable again until the SFP is inserted, at which point the PLL locked signal begins to toggle.

I have 2 carrier cards and 2 picozeds, and this behavior is exhibited on both.

I've also got several SFPs. The Finisar copper SFPs cause the PLL to lose lock instantly upon being inserted (or the PLL never maintains a lock if the SFP was in the slot when the board was powered on). The Finisar fiber (850 nm) SFPs can be inserted without disrupting the lock, but the lock is lost as soon as an optical input is connected to the SFP RX port. I have dozens of SFPs; I've tried several different ones and the behavior is the same with every one.

At this point, I'm wondering if anyone out there has managed to use the SFP port on the carrier card (I can't possibly be the first?!). I'm a computer programmer and relatively new to FPGA design and I therefore don't discount the possibility that there may be a flaw in my bitstream.

Any wisdom would be appreciated.

zedman2000
Moderator(2)
Hi there,

I have personally used Finisar SFP+ modules with this carrier card for both a 7015 and 7030. I have not seen the behavior you are seeing.

I think first, let's validate all of your hardware. To validate your hardware, can you run through the IBERT design that was posted? You can find the walk-through here:
http://picozed.org/support/design/4701/76
Search "IBERT Design"
Feel free to extract the prebuilt project and run that as a QUICK validation. I would suggest walking through the document to familiarize yourself with the IBERT tool set as it might become necessary for our troubleshooting for you to be familiar with it.

You can also check out these tech-tip videos on how I set up and configured the board set.
http://picozed.org/support/trainings-and-videos
Tech Tip - Transceiver Tools 101: Intro to IBERT
Tech Tip - Transceiver Tools 102: We have an IBERT bit stream, now what?
Tech Tip - Transceiver Tools 103: Now that we are running, what are all these adjustments?
Tech Tip - Transceiver Tools 104: Getting More Margin

This should prove whether your hardware is good or not. Please note that there is a laser ENABLE that needs to be set in order to run the SFP with the design. The walkthrough describes what you need to change in order to set this, or you can use the prebuilt image. That is the image I use in the design.

Please let me know how that goes and then we can take the next step of figuring out the PLL locking issue.

--Dan

manabit
Junior(0)
IBERT Testing

Hi Dan,

OK, I spent considerable time over the last several days trying to validate the hardware and I'm getting strange results.

Firstly, I followed the IBERT videos you made step by step, including altering my board config to a 250 MHz clock. I only have the SFP+ loopback (no SMA cables, FMC or PCIe at this time). My output is the same as yours with the following differences: I'm now using 2015.3 (was using 2015.2 before) and a Xilinx Platform Cable USB II, and once I get to the auto-detect serial links screen, my Vivado detects none.

Then it gets goofy. I have 2 carrier cards and 2 7015's. The first card with pll/xcvrs configured as per your video will not connect - all xcvrs in near-end pma loopback, pll locked, but no link. Multiple attempts at reset fail to allow the device to link.

Now the second board:

If I configure it for internal loopback (near-end PMA), I can get all channels to lock as long as the SFP loopback adapter isn't in the slot. I let it run for 10 minutes or so without bit errors in internal loopback, but once I plug in the SFP loopback adapter, the links go down and I can't get the internal near-end PMA loopback to stay linked and error-free on all 4 channels at once (certain reset combinations will land me with 0 and 1 connecting, 2 and 3 no link, etc.). Removing the SFP loopback plug returns the internal loopback to a functional state on all 4 channels.

Then it gets seriously voodoo: if, using Hardware Manager, I configure the GTP common to use GTREFCLK0 as the clock source for PLL0, the PLL again returns to lock and all loopbacks work instantly (@3.111 GHz), with zero bit errors. I remove the internal loopback from the SFP and, again, I get zero bit errors going through the actual SFP loopback. As a sanity check, I remove the SFP loopback plug to confirm that MGT2 goes down (it does), then reinstall it, and the link comes back instantly. This behavior is identical on both of my carrier card/PicoZed combinations. (I did also try swapping a PicoZed to the other carrier card to see if perhaps I had a working combination, but they both behave similarly.)

I double-checked the schematic and every piece of documentation I can find, and MGT_REFCLK0 is clearly connected to the ICS874003 jitter attenuator. I'm presuming that the VCO in this device still outputs a stable clock in the absence of an input clock signal?

I seem to be unable to get an error-free loopback when PLL0 is connected to MGT_REFCLK1 (gtrefclk1) while an SFP is in the slot. In my design (where I was using a 125 MHz configuration), the PLL would not stay locked when SFPs were in place (as mentioned). This is not the behavior I usually see in the IBERT, though it's done it to me once or twice on REFCLK1. REFCLK0 seems bulletproof, but it's obviously not properly clocked.

On both boards, changing the pattern from PRBS 7-bit to PRBS 31-bit and resetting causes the REFCLK1-clocked transceivers, in internal loopback, to disconnect constantly and report many errors. Setting to REFCLK0 and 31-bit still leads to zero bit errors, even when looped back externally through the SFP.
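
For context on the two patterns: PRBS-7 and PRBS-31 are commonly defined by the polynomials x^7 + x^6 + 1 and x^31 + x^28 + 1. The Python sketch below generates such a sequence, counts bit errors against a reference, and measures the longest run of identical bits. The helper names are hypothetical and this is not how IBERT is implemented internally. A longer pattern contains longer runs without transitions, which is one plausible reason it is harder on a marginal clock; that is general background, not a diagnosis of these boards.

```python
def prbs_bits(order, tap, count, seed=1):
    """Yield `count` bits of the PRBS defined by x^order + x^tap + 1.

    Each new bit is the XOR of the bits produced `order` and `tap`
    steps earlier (a simple Fibonacci LFSR).
    """
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero, or the LFSR stays at zero")
    for _ in range(count):
        new_bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        yield new_bit


def bit_errors(sent, received):
    """Count positions where the received bit differs from the sent bit."""
    return sum(a != b for a, b in zip(sent, received))


def longest_run(bits):
    """Length of the longest run of consecutive identical bits."""
    best = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        prev = b
        best = max(best, run)
    return best


# Two full periods of PRBS-7 (period 2**7 - 1 = 127 bits).
prbs7 = list(prbs_bits(7, 6, 254))

# Flip one bit to emulate a single error on the link.
corrupted = prbs7[:]
corrupted[10] ^= 1

print("PRBS-7 longest run:", longest_run(prbs7))
print("bit errors:", bit_errors(prbs7, corrupted))
```

The same generator works for PRBS-31 with `prbs_bits(31, 28, n)`, but a full period is 2**31 - 1 bits, so only a slice of it is practical to inspect in pure Python.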

I ordered a picozed 7030 yesterday. At this point I'm assuming that ibert is probably nearly impossible to mess up given its simplicity and that the issue I'm seeing points to some type of hardware issue with the CDCM61002 or wiring/power to/from it... I did buy both carrier cards and both picozeds in the same order at the end of June.

Also note that I bought a loopback SFP+ to ensure that the laser-on signal didn't play into any of my tests. I had to hack up the earlier IBERT example when using an 850 nm SFP, but that doesn't seem to be required for my loopback (which is probably just looped on the PCB).

Your insight is appreciated.

zedman2000
Moderator(2)
Hi there Mark,

First, thanks for being thorough. It helps as I cannot be there with you to see everything.
Second, I do not recommend connecting the SFP modules WHILE the system has power. I personally have never done that.
Third, it seems to me that the clock is probably not correct. MGTREFCLK0 will ALWAYS have a clock on it. When the IDT 874003 has no input clock (your situation) it will still run and generate a clock. Depending on how SW8 is set up, you are probably generating 250 MHz, which is the default factory configuration. Since you followed my guide, I had you set everything up for 250 MHz. I find it strange that you are not getting 250 MHz after matching the recommended configuration. Can you double-check the SW9/SW10 setup? With the PCIe card edge nearest you, you should see CLOSE/OFF; CLOSE/OFF; CLOSE/OFF; AWAY/ON; AWAY/ON; CLOSE/OFF.
There is an image in section 2b on page 17 of the IBERT documentation.

I am glad that you have ordered an SFP loopback adapter; that will remove questions about the Verilog modifications. Another way to remove this question: a few days ago, the IBERT reference designs were updated to REV 1.1, which now includes a scripted method to configure the IBERT design. You basically download the Git repository, extract it, open Vivado, and run the appropriate command (see documentation); the scripted environment then does everything else, including file manipulation and JTAG.

The reason I am circling back to the clock is that, looking over your procedures above, you claim to have loopback running without issue when using MGTREFCLK0 (which, as I said, should be defaulting to 250 MHz). Assuming you have the same IBERT image and change ONLY the clock input field (see #13 on page 11), there is no reason the IBERT would not run just as well for either. Also, make certain that you are setting the System clock to use the Quad 112 clock, although if that were the issue, you would probably have a non-working IBERT using both MGTREFCLK0 and 1.

Can you tell me what the @3.111GHz number is from? Is that from what Vivado is telling you it is seeing as a recovered clock? What number do you see in the Status column?

This statement also bothers me: "all xcvrs in near-end pma loopback, pll locked". If the PLL is locked, near-end PMA should be working. When in this situation, have you tried near-end PCS? That is digital ONLY and will validate that the image is working as expected (clocks, data shifters, and other logic). Have you also tried clicking the RX Reset button? There are a few troubleshooting tips in Video 103 near 3:45 or so, along with some details about WHAT you can expect to see to KNOW that everything is running A-OK.
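
The loopback reasoning above can be summarized as a small decision helper. This is only a sketch of the order of checks described in this post: the function name, its arguments, and the wording of each conclusion are illustrative assumptions, not output from Vivado or IBERT.

```python
def triage(pll_locked, near_end_pcs_ok, near_end_pma_ok, external_ok):
    """Return the first layer that fails, checking from the inside out.

    Each argument is a bool recorded by hand from the IBERT session:
    PLL lock status, then whether each loopback mode links error-free.
    """
    if not pll_locked:
        return "reference clock / PLL"
    if not near_end_pcs_ok:
        # Near-end PCS is the digital-only path, so a failure here
        # points at the image itself: clocks, data shifters, other logic.
        return "digital logic or clocking in the image"
    if not near_end_pma_ok:
        return "transceiver analog path (PMA)"
    if not external_ok:
        return "SFP module, cage, or board routing"
    return "no fault found"


# The situation quoted above: PLL locked, near-end PMA shows no link,
# near-end PCS assumed good for the sake of the example.
print(triage(pll_locked=True, near_end_pcs_ok=True,
             near_end_pma_ok=False, external_ok=False))
```

Checking from the innermost loopback outward means each step adds only one new piece of the signal path, so the first failing step names the piece to look at.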

A quick note, I'm not sure if you have changed any of the bank voltages when using your PicoZed 7015 SOM, but remember that the PicoZed 7030 is a 1.8V device! If you have CON2 set to anything but 1.8V or disconnected (no jumper) you will damage your ZC7030 based SOM!!

--Dan

manabit
Junior(0)
IBERT Testing 2

I made you a video:

https://youtu.be/RpVUA37LX58

I want to stress that I've obviously attempted the IBERT test from a fresh boot of the board and restart of Vivado without making changes to the PLL. I've now followed your video through a few times and even tried loading the files (that was a while back). I'm confident that I've followed the instructions properly. The jumpers are configured as follows (with the PCIe edge connector facing downwards): SW8: all down (off); SW9: all down (off); SW10: down, up, up, down. As you stated, an improper configuration of these jumpers would likely lead to a different clock speed, but you'd think it would still come up in an IBERT loopback unless it created a rate that was completely out of range.

I also tried near-end PCS. I wish I had recorded that behavior on the video as well, but it took my phone a while to upload it over Wi-Fi. In that mode, links come and go with the GTREFCLK1 clock source and errors are present on all ports, but unlike "near end pma", the links will actually connect.

When I play with the PLL0 properties while it's on GTREFCLK0, it's really robust and works as expected, i.e. changing the divider properties and resetting the TX/RX results in the expected change in rate and the connection re-establishes itself. On GTREFCLK1, it always works badly, but some settings yield better connectivity than others: a 7-bit pattern tends to stay connected more than a 31-bit one, etc.
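
A minimal sketch of the divider arithmetic behind that "expected change in rate". The formulas (PLL output = refclk x N1 x N2 / M, line rate = PLL output x 2 / D), the allowed divider values, and the 1.6 to 3.3 GHz PLL range are assumptions based on the 7 series GTP transceiver user guide (UG482) and should be checked against it; the function names are hypothetical.

```python
from itertools import product

# Assumed GTP PLL0/PLL1 divider choices and operating range (verify in UG482).
N1_VALUES = (4, 5)
N2_VALUES = (1, 2, 3, 4, 5)
M_VALUES = (1, 2)
D_VALUES = (1, 2, 4, 8)
PLL_MIN_MHZ, PLL_MAX_MHZ = 1600.0, 3300.0


def pll_out_mhz(refclk_mhz, n1, n2, m):
    """PLL output frequency in MHz for a given reference clock and dividers."""
    return refclk_mhz * n1 * n2 / m


def line_rate_mbps(refclk_mhz, n1, n2, m, d):
    """Serial line rate in Mb/s; the factor 2 assumes both clock edges are used."""
    return pll_out_mhz(refclk_mhz, n1, n2, m) * 2 / d


def find_settings(refclk_mhz, target_mbps, tol=1e-6):
    """List divider combinations that hit the target rate with the PLL in range."""
    hits = []
    for n1, n2, m, d in product(N1_VALUES, N2_VALUES, M_VALUES, D_VALUES):
        f_pll = pll_out_mhz(refclk_mhz, n1, n2, m)
        if not PLL_MIN_MHZ <= f_pll <= PLL_MAX_MHZ:
            continue  # PLL would be outside its assumed operating range
        if abs(line_rate_mbps(refclk_mhz, n1, n2, m, d) - target_mbps) <= tol:
            hits.append({"N1": n1, "N2": n2, "M": m, "D": d, "pll_mhz": f_pll})
    return hits


for setting in find_settings(125.0, 1250.0):
    print(setting)
```

The call `find_settings(125.0, 1250.0)` searches for divider combinations that produce the 1.25 Gb/s line rate of 1000BASE-X from the 125 MHz reference clock mentioned at the start of the thread. If the reference clock actually arriving at the PLL differs from the one the dividers were chosen for, the resulting rate scales by the same ratio.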

I went into the properties and removed the powerdown flag from pll1, reset it and attached it to gtrefclk1 then tried moving the rx+tx for a single channel to pll1 - pll1 worked the same way as pll0, stable on GTREFCLK0 and unreliable on GTREFCLK1.

Also, thanks for the heads up on the 1.8v - had already decided to go to 1.8v in case we moved to the 7030.

Warm Regards,
Mark

manabit
Junior(0)
Works on Z7030

Hi,

I received the 7030 I had ordered on Friday and did the ibert tests on the same carrier card. All tests passed on the first attempt with internal or external loop back. I was able to switch the clocks back and forth and the system was stable on either refclk1 or 0... Sounds like I may have 2 dead 7015's (?)

zedman2000
Moderator(2)
Mark,

The video was very helpful. Thank you for posting that.
I notice you are running the Linux version of the tools. Just out of curiosity, what is the default language of your operating system? We've seen the non-refreshing issue, and it seemed to be more prevalent on Linux with a non-English default language; however, it can also be an issue under Windows, again without English as its default.

Can you tell me what position JP6 is set to? Is there a jumper installed? That MUX would prohibit clock 1 from getting to the PicoZed if configured incorrectly. The jitter attenuator has a straight shot into the PicoZed MGTREFCLK0 (this is on page 3 of the PicoZed Rev C Carrier Card schematic).
[ http://picozed.org/support/documentation/4701 search Schematics ] I am mostly asking to get the complete picture as you stated your PZ7030 is working just fine.

Were you able to perform the rework listed in the Errata? Same webpage as above, search [PicoZed FMC Carrier Card Rev C Errata]. I searched our thread and do not see the word Errata, so I think I did not mention it and you may not have had the rework performed.

There is a possibility that something went wrong when hot plugging the SFP module. I have not personally tested that, as I always turn the power off before plugging/unplugging anything.

I am rather happy to see that your 7030 is working fine. Keep in mind, that board has the Zynq 7030 SoC on it, which uses the MUCH more robust GTX transceivers. As such my proposed poor quality clock suggestion might be why you have a 7015/7030 difference (see errata doc).

A few of the reworks listed address signal integrity issues, but most of those related to the transceivers are for clock quality issues. Without those reworks my 7015 would struggle to work as well.

I think we should be getting close to the end at this point. Let me know what you find and if needed, we can take the next step.

--Dan