|
Message
From: Tadej Markovic<tadejm@f...>
Date: Fri Feb 13 12:06:21 CET 2004
Subject: [pci] PCI Bridge Core bugs found
Hi,I forgot to comment following IF statement: (line 844 in pci_wb_master.v):
if (~pciw_fifo_empty_in) pciw_fifo_renable = 1'b0 ; // prepare next value (address when new trans., data when burst tran.) else pciw_fifo_renable = 1'b0 ;
One of the "pciw_fifo_renable" assignments had "1'b1" assigned and it was a bug. I have repaired that and at that time I left this IF statement as is, just for case if bug would have been somewhere else.
So this IF statement is correct and Synthesis tool don't have any problems optimising it.
Regards, Tadej
-----Original Message----- From: Tadej Markovic [mailto:tadejm@f...] Sent: 13. februar 2004 9:15 To: 'Discussion list about free,open source PCI IP core' Subject: RE: [pci] PCI Bridge Core bugs found
Hi,
I have some comments. You mentioned, that one of the state machines is asynchronous. Well, I'll try to explain, why this is not true.
What you were looking at, is "// state machine logic", where some of the control signals are changed regarding the "c_state" (current state of FSM) and regarding some other signals. In the same manner, the "n_state" (next state of FSM) is changed. "// state machine register control and registered outputs" is written in the part of code above "// state machine logic". In this register control part, all signals change values only when rising edge of "wb_clock_in" or "reset_in" occures. And the value of "n_state" is written into "c_state" in this part of code.
The pricpile of writing RTL code, which means Register Transfer Level, is that logical and register parts are written in separate blocks. This is not neccessary by it self, but it helps Synthesis and Implementation tools to make their jobs better, when for example Finite State Machine is not small.
I'm not saying, that my code can't have any bugs, I'm just saying that most of your conclusions are based on wrong image, how that part of described code works. I hope that explanation is helpful and I it would be very nice of you, if you could re-check again, because if there is a bug, it is in everybody's interest to remove it. But you must be carefull with constraints, when implemeting the PCI core, otherwise you can have problems with synchronization of signals on different clock domains.
Regards, Tadej
-----Original Message----- From: pci-bounces@o... [mailto:pci-bounces@o...]On Behalf Of Andriy Knysh Sent: 11. februar 2004 22:03 To: 'Discussion list about free, open source PCI IP core' Subject: [pci] PCI Bridge Core bugs found
Hello everybody,
I've got a question to all of you: have you ever synthesized and tested the PCI Bridge Core in a real application? Have you ever used the Core with Wishbone bus connected to a real device? I mean real application (FPGA or ASIC), not in a simulator, because in my experience almost everything works just fine in a simulator, but does not work in the real world.
I'm asking you this, because I have found a few major flaws in the Core, and in my opinion, the Core (this is mostly related to Wishbone bus) would never work in an FPGA or ASIC.
I have sent that e-mail to the authors (Tadej and Miha), but never got an answer back.
So, I'm asking you all if you have tested the core. If not, you should read the rest of the e-mail and check, if I'm wrong or right.
Thanks for your attention.
Regards, Andriy Knysh, Ph.D.
____________________________________________________________________________ _______ When I started to make my PCI board I was thinking about using
the Altera PCI Core, but obviously it was too expensive.
So I downloaded the PCI Bridge core from opencores.org and added Altera
Stratix
memory instances for the PCI and WB FIFOs (M4K memory blocks).
Now it is working very well with my PCI controller based on Altera Stratix
FPGA.
Unfortunately, during my design I stuck at some moment and after spending a
few days I found a few bugs in the PCI core.
First, in the pci_wb_master.v file. I noticed that when I write a value to
any of my registers (all my registers are 32 bit) in the
FPGA (not a configuration space register - this was working fine), I get the
address of the PCI transaction written to the
register instead of the data. Initially I thought it was a bug in my PC
driver or my application, but after some time I eventually
got to the file pci_wb_master.v of the PCI Bridge.
Here I have found a few bugs related to the problem mentioned above.
First, in the following state machine (line 733 in the original
pci_wb_master.v file)
// state machine logic
always@(c_state or
wb_ack_i or
wb_rty_i or
wb_err_i or
w_attempt or
r_attempt or
retried or
burst_chopped or
burst_chopped_delayed or
rty_i_delayed or
pci_tar_read_request or
rty_counter_almost_max_value or
set_retry or
last_data_to_pcir_fifo or
first_wb_data_access or
pciw_fifo_control_in or
pciw_fifo_empty_in or
burst_transfer or
last_data_from_pciw_fifo_reg
)
I have found the following (line 844 in the original file pci_wb_master.v):
if (~pciw_fifo_empty_in)
pciw_fifo_renable = 1'b0 ; // prepare next value (address when new
trans., data when burst tran.)
else
pciw_fifo_renable = 1'b0 ;
I guess this is a bug, since in order to read the next data from the FIFO it
should say:
if (~pciw_fifo_empty_in)
pciw_fifo_renable = 1'b1 ; // prepare next value (address when new
trans., data when burst tran.)
else
pciw_fifo_renable = 1'b0 ;
Second, in the following state machine (line 622 in the original file)
// state machine register control and registered outputs (without wb_adr_o
counter)
always@(posedge wb_clock_in or posedge reset_in)
begin
........................................................................
I have found this logic (line 659 in the original file):
if ((pciw_fifo_renable_out && ~addr_into_cnt) || addr_into_cnt_reg)
begin
wb_sel_o <= #`FF_DELAY ~pciw_fifo_cbe_in;
wb_dat_o <= #`FF_DELAY pciw_fifo_addr_data_in;
end
Here the authors are trying to write to wb_dat_o data output register when
one of the following conditions is met:
1) pciw_fifo_renable_out is 1 and addr_into_cnt is 0
2) addr_into_cnt_reg is 1
Let's consider these two conditions. The first one says that if
pciw_fifo_renable_out is 1 (e.g. reading from the FIFO
is enabled since pciw_fifo_renable_out = pciw_fifo_renable ||
addr_into_cnt_reg) and addr_into_cnt is 0.
This condition has no chance to be true for the following reasons:
On line 780 of the original file the authors wrote:
3'b100 : // Write request for PCIW_FIFO
to WB bus transaction
begin // If there is new transaction
if (burst_chopped_delayed)
begin
addr_into_cnt = 1'b0 ; //
address must not be latched into address counter
pciw_fifo_renable = 1'b1 ; //
first location is address (in FIFO), next will be data
end
else
begin
if
(pciw_fifo_control_in[`ADDR_CTRL_BIT])
addr_into_cnt = 1'b1 ; //
address must be latched into address counter
else
addr_into_cnt = 1'b0 ;
pciw_fifo_renable = 1'b1 ; //
first location is address (in FIFO), next will be data
end
read_count_load = 1'b0 ; // no need
for cache line when there is write
n_state = S_WRITE ;
Let's consider single write to a WB slave from PCI.
I have highlighted this in bold. Only here
pciw_fifo_renable is set to 1 and addr_into_cnt is
set to 0. It seems like it is a good chance for that
condition to be true, but at the same time you
change the state to S_WRITE. And, since the authors have made that state
machine asynchronous (not a good idea by itself),
the system goes to S_WRITE state immediately (line 821 in the original
file).
And, since neither wb_ack_i nor wb_err_i nor
wb_rty_I is set to 1 at that time, the system goes directly to default state
(line 908 in the original file). From here, if the
retry counter is disabled or it has not reached the max value yet), the line
927
(pciw_fifo_renable = 1'b0 ;) will be executed
immediately. At the next rising edge of the system clock (it is
unpredictable
when it comes since the state machine is
asynchronous), the state machine (line 622) will be in the S_WRITE state,
and line 659
will be executed, but the first condition will be
not met (since pciw_fifo_renable is 0 at that time).
The second condition ( .... ||
addr_into_cnt_reg) is met when, for example, it was an address phase on line
789
if (pciw_fifo_control_in[`ADDR_CTRL_BIT])
addr_into_cnt = 1'b1 ; // address must be latched into address counter
and then at the next rising edge of the system clock
(also asynchronous !!!) line 498 will be executed and addr_into_cnt_reg will
be set to
1 as well as on line 641 c_state will be set to
S_WRITE. Then on line 826 addr_into_cnt is immediately set to 0 (since it is
asynchronous),
but addr_into_cnt_reg remains 1 until the next
rising edge of the system clock when on line 659 the second condition is met
and the last
address from the FIFO is latched to wb_dat_o (
wb_dat_o <= #`FF_DELAY pciw_fifo_addr_data_in;).
This problem has its roots in the asynchronous
nature of the state machine when a signal changes its state before or at the
rising edge
of the system clock. This is a classical problem
with metastability - I have seen it a lot. What is interesting is that it
almost always works
in a simulator, but almost never in a real system
(simulators do not take into account that a signal changes its state not
instantly -
it is almost eternity in the world of electrons and
atoms).
To fix the problem I just rewrote the following
lines (659-663):
if (!addr_into_cnt)
begin
wb_sel_o <= #`FF_DELAY ~pciw_fifo_cbe_in;
wb_dat_o <= #`FF_DELAY
pciw_fifo_addr_data_in;
end
I think it is absolutely not necessary to check for
addr_into_cnt_reg or to check if we read from the FIFO.
It is only necessary to check if it is a data phase
(not address) so we can latch the new values into wb_dat_o and wb_sel_o.
After the changes have been done, the core is
working on Altera Stratix just fine without any problems.
Another problem that I found (again, it is related
to the asynchronous nature of the state machine) is in the
file pci_wb_slave.v, starting on line 651:
// state machine logic
always@(
c_state or
wattempt or .............................
Since the state machine has been made asynchronous,
all the assignments from line 680 to line 710 never get any chance to be
synthesized and executed (except, maybe in a
simulator !!!!!!!).
The only logic that will be synthesized is enclosed
in the case statement on line 712:
case (c_state)
S_IDLE: begin
..............................................
What is bothering me is that since the authors have
put some logic before the case statement, they had some algorithm in mind.
And, since that logic never get synthesized and
executed, that algorithm does not work, and I suspect there are some
major flaws in that implementation.
Again, thanks for your attention.
Andriy Knysh, Ph.D.
_______________________________________________
http://www.opencores.org/mailman/listinfo/pci
|
 |